
RECOMBINANT EXPRESSION VECTORS AND
PURIFICATION METHODS FOR THERMUS THERMOPHILUS
DNA POLYMERASE

The present invention relates to a purified, thermostable DNA polymerase
purified from Thermus thermophilus and recombinant means for producing the
enzyme. Thermostable DNA polymerases are useful in many recombinant DNA
techniques, especially nucleic acid amplification by the polymerase chain reaction
(PCR).

Extensive research has been conducted on the isolation of DNA polymerases
from mesophilic microorganisms such as E. coli. See, for example, Bessman et al.,
1957, J. Biol. Chem. 222:171-177 and Buttin and Kornberg, 1966, J. Biol. Chem.
241:5419-5427.

Much less investigation has been made on the isolation and purification of DNA
polymerases from thermophiles such as Thermus thermophilus. Kaledin et al., 1980,
Biokhimiya 45:644-651 disclose a six-step isolation and purification procedure of
DNA polymerase from cells of T. aquaticus YT-1 strain. These steps involve isolation
of crude extract, DEAE-cellulose chromatography, fractionation on hydroxyapatite,
fractionation on DEAE-cellulose, and chromatography on single-strand DNA-cellulose.
The pools from each stage were not screened for contaminating endo- and
exonuclease(s). The molecular weight of the purified enzyme is reported as 62,000
daltons per monomeric unit.

A second purification scheme for a polymerase from Thermus aquaticus is
described by Chien et al., 1976, J. Bacteriol. 127:1550-1557. In this process, the
crude extract is applied to a DEAE-Sephadex column. The dialyzed pooled fractions
are then subjected to treatment on a phosphocellulose column. The pooled fractions are
dialyzed and bovine serum albumin (BSA) is added to prevent loss of polymerase
activity. The resulting mixture is loaded on a DNA-cellulose column. The pooled
material from the column is dialyzed and analyzed by gel filtration to have a molecular
weight of about 63,000 daltons and by sucrose gradient centrifugation of about 68,000
daltons.

The use of thermostable enzymes, such as those prepared by Chien et al. and
Kaledin et al., to amplify existing nucleic acid sequences in amounts that are large
compared to the amount initially present was described in U.S. Patent Nos. 4,683,195;
4,683,202; and 4,965,188, which describe the PCR process. Primers, template,
nucleoside triphosphates, the appropriate buffer and reaction conditions, and a
polymerase are used in the PCR process, which involves denaturation of target DNA,
hybridization of primers, and synthesis of complementary strands. The extension
product of each primer becomes a template for the production of the desired nucleic acid
sequence. The patents disclose that, if the polymerase employed is a thermostable
enzyme, then polymerase need not be added after every denaturation step, because heat
will not destroy the polymerase activity.
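
Because the extension product of each primer serves as a template in the next cycle, the amount of target grows roughly geometrically with the number of cycles. The following is a minimal sketch of that arithmetic, not a model taken from the cited patents: the function name and the per-cycle efficiency parameter are assumptions introduced here for illustration.

```python
def copies_after_cycles(initial_copies: float, cycles: int,
                        efficiency: float = 1.0) -> float:
    """Idealized PCR yield (illustrative assumption, not from the patents).

    Each cycle multiplies the number of target copies by (1 + efficiency),
    where efficiency = 1.0 means every template is copied in every cycle.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Perfect doubling versus a hypothetical 80% per-cycle efficiency.
    print(copies_after_cycles(1, 30))
    print(copies_after_cycles(1, 30, efficiency=0.8))
```

Real reactions plateau as primers and nucleoside triphosphates are consumed, so this sketch overestimates yield at high cycle numbers.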

European Patent Publication No. 258,017; PCT Publication No. 89/06691; and
U.S. Patent No. 4,889,818 describe the isolation and recombinant expression of an
~94 kDa thermostable DNA polymerase from Thermus aquaticus and the use of that
polymerase in PCR. Although T. aquaticus DNA polymerase is especially preferred
for use in PCR and other recombinant DNA techniques, there remains a need for other
thermostable polymerases.

Accordingly, there is a desire in the art to produce a purified, thermostable DNA 
polymerase that may be used to improve the PCR process described above and to 
improve the results obtained when using a thermostable DNA polymerase in other 
recombinant techniques such as DNA sequencing, nick-translation, and even reverse
transcription. The present invention helps meet that need by providing recombinant
expression vectors and purification protocols for Thermus thermophilus DNA 
polymerase. 

Accordingly, the present invention provides a purified thermostable enzyme that
catalyzes combination of nucleoside triphosphates to form a nucleic acid strand
complementary to a nucleic acid template strand. The purified enzyme is the DNA
polymerase from Thermus thermophilus (Tth) and has a molecular weight predicted
from the nucleic acid sequence of the gene of about 94 kDa. This purified material
may be used in a temperature-cycling amplification reaction wherein nucleic acid
sequences are produced from a given nucleic acid sequence in amounts that are large
compared to the amount initially present so that the sequences can be manipulated
and/or analyzed easily.

The gene encoding Tth DNA polymerase enzyme from Thermus thermophilus
has also been identified and cloned and provides yet another means to prepare the
thermostable enzyme of the present invention. In addition to the gene encoding the Tth
enzyme, gene derivatives encoding Tth DNA polymerase activity are also provided.

The invention also encompasses a stable enzyme composition comprising a 
purified, thermostable Tth enzyme as described above in a buffer containing one or 
more non-ionic polymeric detergents. 

Finally, the invention provides a method of purification for the thermostable
polymerase of the invention. This method involves preparing a crude extract from
Thermus thermophilus cells, adjusting the ionic strength of the crude extract so that the
DNA polymerase dissociates from nucleic acid in the extract, subjecting the extract to
hydrophobic interaction chromatography, subjecting the extract to DNA binding protein
affinity chromatography, and subjecting the extract to cation or anion exchange or
hydroxyapatite chromatography. In a preferred embodiment, these steps are carried out
sequentially in the order given above, and non-ionic detergent is added to the extract
prior to the DNA binding protein affinity chromatography step. The nucleotide binding
protein affinity chromatography step is preferred for separating the DNA polymerase
from endonuclease proteins.

The present invention provides DNA sequences and expression vectors that
encode Tth DNA polymerase. To facilitate understanding of the invention, a number of
terms are defined below.

The terms "cell," "cell line," and "cell culture" can be used interchangeably and
all such designations include progeny. Thus, the words "transformants" or
"transformed cells" include the primary transformed cell and cultures derived from that
cell without regard to the number of transfers. All progeny may not be precisely
identical in DNA content, due to deliberate or inadvertent mutations. Mutant progeny
that have the same functionality as screened for in the originally transformed cell are
included in the definition of transformants.

The term "control sequences" refers to DNA sequences necessary for the
expression of an operably linked coding sequence in a particular host organism. The
control sequences that are suitable for procaryotes, for example, include a promoter,
optionally an operator sequence, a ribosome binding site, and possibly other
sequences. Eucaryotic cells are known to utilize promoters, polyadenylation signals,
and enhancers.

The term "expression system" refers to DNA sequences containing a desired
coding sequence and control sequences in operable linkage, so that hosts transformed
with these sequences are capable of producing the encoded proteins. To effect
transformation, the expression system may be included on a vector; however, the
relevant DNA may also be integrated into the host chromosome.

The term "gene" refers to a DNA sequence that encodes a recoverable bioactive
polypeptide or precursor. The polypeptide can be encoded by a full-length gene
sequence or by any portion of the coding sequence so long as the enzymatic activity is
retained.

The term "operably linked" refers to the positioning of the coding sequence
such that control sequences will function to drive expression of the protein encoded by
the coding sequence. Thus, a coding sequence "operably linked" to control sequences
refers to a configuration wherein the coding sequences can be expressed under the
control of a control sequence.

The term "mixture" as it relates to mixtures containing Tth polymerase refers to
a collection of materials which includes Tth polymerase but which can also include
other proteins. If the Tth polymerase is derived from recombinant host cells, the other
proteins will ordinarily be those associated with the host. Where the host is bacterial,
the contaminating proteins will, of course, be bacterial proteins.

The term "non-ionic polymeric detergents" refers to surface-active agents that 
have no ionic charge and that are characterized, for purposes of this invention, by an 
ability to stabilize the Tth enzyme at a pH range of from about 3.5 to about 9.5,
preferably from 4 to 8.5. 

The term "oligonucleotide" as used herein is defined as a molecule comprised of
two or more deoxyribonucleotides or ribonucleotides, preferably more than three, and
usually more than ten. The exact size will depend on many factors, which in turn
depend on the ultimate function or use of the oligonucleotide. The oligonucleotide
may be derived synthetically or by cloning.

The term "primer" as used herein refers to an oligonucleotide, whether
occurring naturally as in a purified restriction digest or produced synthetically, which is
capable of acting as a point of initiation of synthesis when placed under conditions in
which synthesis of a primer extension product which is complementary to a nucleic acid
strand is initiated, i.e., in the presence of four different nucleoside triphosphates and
the Tth thermostable enzyme in an appropriate buffer ("buffer" includes pH, ionic
strength, cofactors, etc.) and at a suitable temperature. For Tth polymerase, the buffer
preferably contains 1 to 3 mM of a magnesium salt, preferably MgCl2, 50-200 µM of
each nucleotide, and 0.5 to 1 µM of each primer, along with 50 mM KCl, 10 mM Tris
buffer, pH 8-8.4, and 100 µg/ml gelatin (although gelatin is not required and should be
avoided in some applications, such as DNA sequencing).

The primer is single-stranded for maximum efficiency in amplification, but may
alternatively be double-stranded. If double-stranded, the primer is first treated to
separate its strands before being used to prepare extension products. The primer is
usually an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the
synthesis of extension products in the presence of the polymerase enzyme. The exact
length of a primer will depend on many factors, such as source of primer and result
desired, and the reaction temperature must be adjusted depending on primer length to
ensure proper annealing of primer to template. Depending on the complexity of the
target sequence, the oligonucleotide primer typically contains 15 to 35 nucleotides.
Short primer molecules generally require cooler temperatures to form sufficiently stable
complexes with template.
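
The dependence of annealing temperature on primer length and composition can be illustrated with a common rule of thumb. The sketch below uses the Wallace rule (about 2°C per A or T and 4°C per G or C), which is not part of this specification and is only a rough guide for short oligonucleotides; the function name and the example primers are hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule.

    Assumption: Tm = 2*(A+T) + 4*(G+C). This is a rule of thumb for short
    oligonucleotides, not a method described in this specification.
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    short_primer = "ATGCATGCATGCATG"            # hypothetical 15-mer
    long_primer = short_primer + "CCGGAATTCC"   # hypothetical 25-mer
    # The shorter primer gives the lower estimate, consistent with the
    # statement that short primers require cooler annealing temperatures.
    print(wallace_tm(short_primer), wallace_tm(long_primer))
```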

A primer is selected to be "substantially" complementary to a strand of specific
sequence of the template. A primer must be sufficiently complementary to hybridize
with a template strand for primer elongation to occur. A primer sequence need not
reflect the exact sequence of the template. For example, a non-complementary
nucleotide fragment may be attached to the 5' end of the primer, with the remainder of
the primer sequence being substantially complementary to the strand.
Non-complementary bases or longer sequences can be interspersed into the primer, provided
that the primer sequence has sufficient complementarity with the sequence of the
template to hybridize and thereby form a template-primer complex for synthesis of the
extension product of the primer.

The terms "restriction endonucleases" and "restriction enzymes" refer to 
bacterial enzymes which cut double-stranded DNA at or near a specific nucleotide 
sequence. 

The term "thermostable enzyme" refers to an enzyme which is stable to heat and
is heat resistant and catalyzes (facilitates) combination of the nucleotides in the proper
manner to form primer extension products that are complementary to each nucleic acid
strand. Generally, synthesis of a primer extension product begins at the 3' end of the
primer and proceeds in the 5' direction along the template strand, until synthesis
terminates.

The Tth thermostable enzyme of the present invention satisfies the requirements
for effective use in the amplification reaction known as the polymerase chain reaction.
The Tth enzyme does not become irreversibly denatured (inactivated) when subjected to
the elevated temperatures for the time necessary to effect denaturation of
double-stranded nucleic acids, a key step in the PCR process. Irreversible denaturation for
purposes herein refers to permanent and complete loss of enzymatic activity. The
heating conditions necessary for nucleic acid denaturation will depend, e.g., on the
buffer salt concentration and the composition and length of the nucleic acids being
denatured, but typically range from about 90 to about 105°C for a time depending
mainly on the temperature and the nucleic acid length, typically from a few seconds up
to four minutes. Higher temperatures may be tolerated as the buffer salt concentration
and/or GC composition of the nucleic acid is increased. The Tth enzyme does not
become irreversibly denatured for relatively short exposures to temperatures of about
90-100°C.

The Tth thermostable enzyme has an optimum temperature at which it functions
that is higher than about 50°C. Temperatures below 50°C facilitate hybridization of
primer to template, but depending on salt composition and concentration and primer
composition and length, hybridization of primer to template can occur at higher
temperatures (e.g., 45-70°C), which may promote specificity of the primer elongation
reaction. The higher the temperature optimum for the enzyme, the greater the
specificity and/or selectivity of the primer-directed extension process. The optimum
temperature for Tth activity ranges from about 50 to 90°C.

The present invention provides the DNA sequence encoding a full-length
thermostable DNA polymerase of Thermus thermophilus. This DNA sequence and the
deduced amino acid sequence are depicted below. For convenience, the amino acid
sequence of this Tth polymerase is numbered for reference, and other forms of the
thermostable enzyme are designated by referring to changes from the full length, native
sequence.

The DNA and amino acid sequences shown above and the DNA compounds
that encode those sequences can be used to design and construct recombinant DNA
expression vectors to drive expression of Tth DNA polymerase activity in a wide
variety of host cells. A DNA compound encoding all or part of the DNA sequence
shown above can also be used as a probe to identify thermostable polymerase-encoding
DNA from other organisms, and the amino acid sequence shown above can be used to
design peptides for use as immunogens to prepare antibodies that can be used to
identify and purify a thermostable polymerase.

Whether produced by recombinant vectors that encode the above amino acid
sequence or by native Thermus thermophilus cells, however, Tth DNA polymerase will
typically be purified prior to use in a recombinant DNA technique. The present
invention provides such purification methodology. For recovering the native protein,
the cells are grown using any suitable technique. Briefly, the cells are grown on a
medium containing, per liter, nitrilotriacetic acid (100 mg), tryptone (3 g), yeast extract
(3 g), succinic acid (5 g), sodium sulfite (50 mg), riboflavin (1 mg), K2HPO4 (522 mg),
MgSO4 (480 mg), CaCl2 (222 mg), NaCl (20 mg), and trace elements. The pH of the
medium is adjusted to 8.0 ± 0.2 with KOH. The yield is increased up to 20 g of
cells/liter if cultivated with vigorous aeration at a temperature of 70°C. Cells in the late
logarithmic growth stage (determined by absorbance at 550 nm) are collected by
centrifugation, washed with a buffer and stored frozen at -20°C.

In another method for growing the cells, a defined mineral salts medium
containing 0.3% glutamic acid supplemented with 0.1 mg/l biotin, 0.1 mg/l thiamine,
and 0.05 mg/l nicotinic acid is employed. The salts include nitrilotriacetic acid, CaSO4,
MgSO4, NaCl, KNO3, NaNO3, ZnSO4, H3BO3, CuSO4, NaMoO4, CoCl2, FeCl3,
MnSO4, and Na2HPO4. The pH of the medium is adjusted to 8.0 with NaOH. The
cells are grown initially at 75°C in a water bath shaker. On reaching a certain density,
one liter of these cells is transferred to a 14-liter fermentor. Sterile air is bubbled
through the cultures and the temperature maintained at 75°C. The cells are allowed to
grow for eight hours before being collected by centrifugation.

After cell growth, the isolation and purification of the enzyme takes place in six
stages, each of which is carried out at a temperature below room temperature,
preferably about 4°C. In the first stage or step, the cells, if frozen, are thawed,
disintegrated with an Aminco French pressure cell (18,000 psi), suspended in a buffer
at about pH 7.5, and centrifuged. In the second stage, the supernatant is collected and
then fractionated by adding a salt such as dry ammonium sulfate and Polymin P to
remove nucleic acids. The pellet (at 0.2 M (NH4)2SO4) is discarded.

In the third stage, the supernatant from the second stage is applied to a phenyl
sepharose column equilibrated with a buffer composed of 0.2 M (NH4)2SO4, 50 mM
Tris-HCl, pH 7.5, and 0.5 mM DTT. Then the column is washed first with buffer (1):
TE buffer containing 0.5 mM DTT and 0.2 M (NH4)2SO4, then with buffer (2): TE buffer
containing 0.5 mM DTT, then with buffer (3): buffer (2) containing 20% ethylene
glycol. The protein is eluted in buffer (4): buffer (3) containing 2 M urea.

In the fourth step, the eluate collected in the third step is applied to a heparin
sepharose column equilibrated with 0.15 M KCl. The column is then washed in the
same buffer and the enzyme eluted with a linear gradient of a buffer such as 0.15 M to
0.75 M KCl. The activity peak is at 0.31 to 0.355 M KCl.
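
For planning which fractions to assay, the position of a given salt concentration within a linear gradient can be estimated by simple interpolation. The following is a minimal sketch under the assumption of an ideal linear gradient with no mixing delay or column dead volume; the function names are introduced here for illustration only.

```python
def gradient_concentration(start_m: float, end_m: float, fraction: float) -> float:
    """Salt concentration (M) after `fraction` (0..1) of an ideal linear gradient."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return start_m + (end_m - start_m) * fraction


def gradient_fraction(start_m: float, end_m: float, conc_m: float) -> float:
    """Fraction (0..1) of the gradient volume at which `conc_m` is reached."""
    if start_m == end_m:
        raise ValueError("gradient start and end must differ")
    fraction = (conc_m - start_m) / (end_m - start_m)
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("concentration lies outside the gradient")
    return fraction


if __name__ == "__main__":
    # Heparin sepharose step: 0.15 to 0.75 M KCl, activity peak at 0.31 to 0.355 M.
    lo = gradient_fraction(0.15, 0.75, 0.31)
    hi = gradient_fraction(0.15, 0.75, 0.355)
    print(f"activity expected between {lo:.0%} and {hi:.0%} of the gradient volume")
```

The same two functions apply to the KCl and NaCl gradients of the later column steps.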

In the fifth stage, the fraction collected in the fourth step is concentrated and
diafiltered against Affigel-blue buffer. The precipitate formed is removed by
centrifugation, and the supernatant is applied to an Affigel-blue column equilibrated
with 0.1 M KCl. The column is then washed with 0.1 M KCl and the enzyme eluted
with a linear gradient of a buffer such as 0.1 to 0.5 M KCl. Fractions with
thermostable enzyme activity are then tested for contaminating deoxyribonucleases
(endo- and exonucleases) using any suitable procedure. For example, the endonuclease
activity may be determined electrophoretically from the change in molecular weight of
phage λ DNA or supercoiled plasmid DNA after incubation with an excess of DNA
polymerase. Similarly, exonuclease activity may be determined electrophoretically
from the change in molecular weight of DNA after treatment with a restriction enzyme
that cleaves at several sites. The fractions determined to have no deoxyribonuclease
activity (peak activity of polymerase elutes at 0.28 to 0.455 M KCl) are pooled and
dialyzed against CM-Trisacryl buffer. The precipitate formed is removed by
centrifugation.

In the sixth step, the supernatant is applied to a CM-Trisacryl column
equilibrated with 50 mM NaCl. The column is washed with 50 mM NaCl and the
enzyme eluted with a linear gradient of a buffer such as 0.05 to 0.4 M NaCl. The
pooled fractions having thermostable polymerase activity and no deoxyribonuclease
activity elute at 0.16 to 0.20 M NaCl.

The molecular weight of the dialyzed product may be determined by any
technique, for example, by SDS-PAGE analysis using protein molecular weight
markers. The molecular weight of the DNA polymerase purified from Thermus
thermophilus is determined by the above method to be about 94 kDa. The molecular
weight of this same DNA polymerase as determined by the predicted amino acid
sequence is calculated to be approximately 94,016 daltons. The purification protocol of
native Tth DNA polymerase is described in detail in Example 1. Purification of the
recombinant Tth polymerase of the invention can be carried out with similar
methodology.
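
A molecular weight can be predicted from an amino acid sequence by summing average residue masses and adding one water for the free termini. The sketch below shows that calculation under stated assumptions: the residue masses are standard average values, the function name is hypothetical, and the example peptide is a placeholder rather than any part of the Tth sequence, which is not reproduced in this section.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the N-terminal H and C-terminal OH


def protein_mass_da(sequence: str) -> float:
    """Average molecular mass (Da) of an unmodified polypeptide."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


if __name__ == "__main__":
    placeholder = "MEAMLPLFEPK"  # hypothetical peptide, for illustration only
    print(f"{protein_mass_da(placeholder):.1f} Da")
```

Applied to a full-length deduced sequence, the same sum yields the kind of predicted value quoted above, which can then be compared with the SDS-PAGE estimate.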

An important aspect of the present invention is the production of recombinant
Tth DNA polymerase. As noted above, the gene encoding this enzyme has been cloned
from Thermus thermophilus genomic DNA. The complete coding sequence (~2.5 kb)
for the Tth polymerase can be easily obtained in an ~3.7 kilobase (kb) HindIII-BstEII
restriction fragment of plasmid pBSM:Tth10, although this ~3.7 kb fragment contains
an internal HindIII restriction enzyme recognition site. This plasmid was deposited
with the American Type Culture Collection (ATCC) in host cell E. coli K12 strain
DG101 on December 21, 1989, under accession No. 68195.
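
When planning a digest such as the HindIII-BstEII excision described above, it helps to locate every recognition site first, since an internal site means a complete digest will cut the fragment in two. The following is a minimal sketch of such a scan; the recognition sequences (HindIII: AAGCTT, BstEII: GGTNACC) are the standard ones, while the function name and the toy sequence are hypothetical and do not represent the deposited plasmid.

```python
import re

# Recognition sequences; N stands for any base. Both sites are palindromic,
# so scanning one strand is sufficient.
SITES = {"HindIII": "AAGCTT", "BstEII": "GGTNACC"}
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_sites(sequence: str, site: str) -> list[int]:
    """Return 0-based start positions of every occurrence of `site`."""
    pattern = "".join(IUPAC[base] for base in site.upper())
    # A lookahead is used so that overlapping occurrences are also reported.
    return [m.start() for m in re.finditer(f"(?={pattern})", sequence.upper())]


if __name__ == "__main__":
    toy = "CCAAGCTTGGGGTAACCTTAAGCTT"  # hypothetical sequence
    for enzyme, site in SITES.items():
        print(enzyme, find_sites(toy, site))
```

Two HindIII positions in the toy sequence mimic the situation of an internal HindIII site, for which a partial digest or a stepwise cloning strategy would be needed to recover the intact fragment.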

The complete coding sequence and deduced amino acid sequence of the
thermostable Tth DNA polymerase enzyme is provided above. The entire coding
sequence of the Tth DNA polymerase gene is not required, however, to produce a
biologically active gene product with DNA polymerase activity. The availability of
DNA encoding the Tth DNA polymerase sequence provides the opportunity to modify
the coding sequence so as to generate mutein (mutant protein) forms also having DNA
polymerase activity. Amino(N)-terminal deletions of the protein, up to about one-third
of the protein, are not believed to destroy polymerase activity of the remaining
fragment, and recombinant truncated proteins, created by deleting approximately
one-tenth of the coding sequence (for the amino-terminus), are quite active in polymerase
assays. Because certain N-terminal shortened forms of the polymerase are active, the
gene constructs used for expression of these polymerases can include the
corresponding shortened forms of the coding sequence.

In addition to the N-terminal deletions, individual amino acid residues in the
peptide chain comprising Tth polymerase may be modified by oxidation, reduction, or
other derivation, and the protein may be cleaved to obtain fragments that retain activity.
Such alterations that do not destroy activity do not remove the protein from the
definition of a protein with Tth polymerase activity and so are specifically included
within the scope of the present invention. Modifications to the primary structure of the
Tth gene DNA polymerase by deletion, addition, or alteration so as to change the amino
acids incorporated into the Tth DNA polymerase during translation can be made without
destroying the high temperature DNA polymerase activity of the protein. Such
substitutions or other alterations result in the production of proteins having an amino
acid sequence encoded by DNA falling within the contemplated scope of the present
invention. Likewise, the cloned genomic sequence, or homologous synthetic
sequences, of the Tth DNA polymerase gene can be used to express a fusion
polypeptide with Tth DNA polymerase activity or to express a protein with an amino
acid sequence identical to that of native Tth DNA polymerase. In addition, such
expression can be directed by the Tth DNA polymerase gene control sequences or by a
control sequence that functions in whatever host is chosen to express the Tth DNA
polymerase.

Thus, the present invention provides the complete coding sequence for Tth
DNA polymerase from which expression vectors applicable to a variety of host systems
can be constructed and the coding sequence expressed. Portions of the Tth
polymerase-encoding sequence are also useful as probes to retrieve other thermostable
polymerase-encoding sequences in a variety of species. Accordingly, portions of the
genomic DNA encoding at least four to six amino acids can be replicated in E. coli and
the denatured forms used as probes, or oligodeoxyribonucleotide probes that encode at
least four to six amino acids can be synthesized and used to retrieve additional DNAs
encoding a thermostable polymerase. Because there may not be an exact match between the
nucleotide sequence of the thermostable DNA polymerase gene of Thermus
thermophilus and the corresponding gene of other species, oligomers containing
approximately 12-18 nucleotides (encoding the four to six amino acid sequence) are
usually necessary to obtain hybridization under conditions of sufficient stringency to
eliminate false positives. Sequences encoding six amino acids supply ample
information for such probes.
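
The relation between peptide length and probe length follows from the triplet code: four to six amino acids correspond to 12 to 18 nucleotides. When a probe is designed from an amino acid sequence rather than from the DNA, codon degeneracy also determines how many distinct oligonucleotides could encode the peptide. The sketch below illustrates both calculations using the standard genetic code; the function names and example peptides are hypothetical.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def probe_length(n_amino_acids: int) -> int:
    """Nucleotides needed to encode `n_amino_acids` residues."""
    if n_amino_acids < 1:
        raise ValueError("need at least one amino acid")
    return 3 * n_amino_acids


def degeneracy(peptide: str) -> int:
    """Number of distinct coding sequences that could encode `peptide`."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    return prod(CODON_COUNT[aa] for aa in peptide.upper())


if __name__ == "__main__":
    print(probe_length(4), probe_length(6))
    # Peptides rich in Met and Trp give far less degenerate probe pools
    # than peptides rich in Leu, Ser, or Arg.
    print(degeneracy("MWKDEF"), degeneracy("LSRLSR"))
```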

The present invention, by providing the coding and amino acid sequences for
Tth DNA polymerase, therefore enables the isolation of other thermostable polymerase
enzymes and the coding sequences for those enzymes. The Taq and Tth DNA
polymerase coding sequences are very similar, and this similarity facilitated the
identification and isolation of the Tth DNA polymerase coding sequence. The regions
of dissimilarity between the Taq and Tth DNA polymerase coding sequences can also
be used as probes, however, to identify other thermostable polymerase coding
sequences that encode enzymes quite divergent from, for example, Taq polymerase but
similar to Tth polymerase.

Several such regions of dissimilarity between the Taq and Tth DNA polymerase
coding sequences exist. These regions include the sequences for codons 225-230;
238-246; 241-249; 335-343; 336-344; 337-345; 338-346; and 339-347. For regions nine
codons in length, probes corresponding to these regions can be used to identify and
isolate thermostable polymerase-encoding DNA sequences that are identical (and
complementary) to the probe for a contiguous sequence of at least five codons. For the
region six codons in length, a probe corresponding to this region can be used to
identify and isolate thermostable polymerase-encoding DNA sequences that are identical
to the probe for a contiguous sequence of at least four codons. Such thermostable
polymerase-encoding DNA sequences need not be from a
Thermus thermophilus species, or even from the genus Thermus, to be isolated, so
long as the requisite homology is present.

Whether one desires to produce an enzyme identical to native Tth DNA
polymerase or a derivative or homologue of that enzyme, the production of a
recombinant form of Tth polymerase typically involves the construction of an
expression vector, the transformation of a host cell with the vector, and culture of the
transformed host cell under conditions such that expression will occur. To construct
the expression vector, a DNA is obtained that encodes the mature (used here to include
all muteins) enzyme or a fusion of the Tth polymerase to an additional sequence that
does not destroy activity or to an additional sequence cleavable under controlled
conditions (such as treatment with peptidase) to give an active protein. The coding
sequence is then placed in operable linkage with suitable control sequences in an
expression vector. The vector can be designed to replicate autonomously in the host
cell or to integrate into the chromosomal DNA of the host cell. The vector is used to
transform a suitable host, and the transformed host is cultured under conditions suitable
for expression of recombinant Tth polymerase. The Tth polymerase is isolated from
the medium or from the cells; recovery and purification of the protein may not be
necessary in some instances, where some impurities may be tolerated.

Each of the foregoing steps can be done in a variety of ways. For example, the
desired coding sequence may be obtained from genomic fragments and used directly in
appropriate hosts. Expression vectors operable in a variety of hosts are constructed
using appropriate replicons and control sequences, as set forth generally
below. Construction of suitable vectors containing the desired coding and control
sequences employs standard ligation and restriction techniques that are well understood
in the art. Isolated plasmids, DNA sequences, or synthesized oligonucleotides are
cleaved, modified, and religated in the form desired. Suitable restriction sites can, if
not normally available, be added to the ends of the coding sequence so as to facilitate
construction of an expression vector, as exemplified below.

Site-specific DNA cleavage is performed by treating with the suitable restriction
enzyme (or enzymes) under conditions that are generally understood in the art and
specified by the manufacturers of commercially available restriction enzymes. See,
e.g., New England Biolabs, Product Catalog. In general, about 1 µg of plasmid or
other DNA is cleaved by one unit of enzyme in about 20 µl of buffer solution; in the
examples below, an excess of restriction enzyme is generally used to ensure complete
digestion of the DNA. Incubation times of about one to two hours at about 37°C are
typical, although variations can be tolerated. After each incubation, protein is removed
by extraction with phenol and chloroform; this extraction can be followed by ether
extraction and recovery of the DNA from aqueous fractions by precipitation with
ethanol. If desired, size separation of the cleaved fragments may be performed by
polyacrylamide gel or agarose gel electrophoresis using standard techniques. See, e.g.,
Methods in Enzymology, 1980, 65:499-560.

Restriction-cleaved fragments with single-strand "overhanging" termini can be
made blunt-ended (double-strand ends) by treating with the large fragment of E. coli
DNA polymerase I (Klenow) in the presence of the four deoxynucleoside triphosphates
(dNTPs) using incubation times of about 15 to 25 minutes at 20 to 25°C in 50 mM Tris
pH 7.6, 50 mM NaCl, 10 mM MgCl2, 10 mM DTT and 5 to 10 µM dNTPs. The
Klenow fragment fills in at 5' protruding ends, but chews back protruding 3' single
strands, even though the four dNTPs are present. If desired, selective repair can be
performed by supplying only one of the, or selected, dNTPs within the limitations
dictated by the nature of the protruding ends. After treatment with Klenow, the mixture
is extracted with phenol/chloroform and ethanol precipitated. Similar results can be
achieved using S1 nuclease, because treatment under appropriate conditions with S1
nuclease results in hydrolysis of any single-stranded portion of a nucleic acid.

Synthetic oligonucleotides can be prepared using the triester method of
Matteucci et al., 1981, J. Am. Chem. Soc. 103:3185-3191 or automated synthesis
methods. Kinasing of single strands prior to annealing or for labeling is achieved using
an excess, e.g., approximately 10 units, of polynucleotide kinase to 0.5 µM substrate
in the presence of 50 mM Tris, pH 7.6, 10 mM MgCl2, 5 mM dithiothreitol (DTT), and
1 to 2 µM ATP. If kinasing is for labeling of probe, the ATP will contain high specific
activity γ-32P.

Ligations are performed in 15-30 µl volumes under the following standard
conditions and temperatures: 20 mM Tris-Cl, pH 7.5, 10 mM MgCl2, 10 mM DTT, 33
µg/ml BSA, 10 mM-50 mM NaCl, and either 40 µM ATP and 0.01-0.02 (Weiss) units
T4 DNA ligase at 0°C (for ligation of fragments with complementary single-stranded
ends) or 1 mM ATP and 0.3-0.6 units T4 DNA ligase at 14°C (for "blunt end"
ligation). Intermolecular ligations of fragments with complementary ends are usually
performed at 33-100 µg/ml total DNA concentrations (5-100 nM total ends
concentration). Intermolecular blunt end ligations (usually employing a 10-30 fold
molar excess of linkers) are performed at 1 µM total ends concentration.
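
The relation between a mass concentration of DNA and the corresponding "total ends" concentration can be sketched as follows. This is only an illustrative calculation: the average mass of 660 g/mol per base pair is a common approximation rather than a value given here, and the 4361 bp fragment length (the size of pBR322) and the function name are chosen purely for the example.

```python
# Average mass of one base pair of double-stranded DNA (g/mol).
# Assumption: a commonly used approximation, not a value from the text.
AVG_BP_MASS = 660.0


def ends_concentration_nM(dna_ug_per_ml: float, fragment_bp: int) -> float:
    """Molar concentration of fragment ends (nM) for linear duplex DNA."""
    grams_per_litre = dna_ug_per_ml * 1e-3            # ug/ml equals mg/L
    molecules_molar = grams_per_litre / (AVG_BP_MASS * fragment_bp)
    return 2 * molecules_molar * 1e9                  # two ends per linear fragment


if __name__ == "__main__":
    # Hypothetical 4361 bp fragment at the two ends of the stated mass range.
    for conc in (33, 100):
        ends = ends_concentration_nM(conc, 4361)
        print(f"{conc} ug/ml -> {ends:.1f} nM ends")
```

For a fragment of this size, both ends of the 33-100 µg/ml range fall inside the 5-100 nM window quoted above; much shorter fragments would exceed it at the same mass concentration.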

In vector construction, the vector fragment is commonly treated with bacterial or
calf intestinal alkaline phosphatase (BAP or CIAP) to remove the 5' phosphate and
prevent religation and reconstruction of the vector. BAP and CIAP digestion
conditions are well known in the art, and published protocols usually accompany the
commercially available BAP and CIAP enzymes. To recover the nucleic acid
fragments, the preparation is extracted with phenol-chloroform and ethanol precipitated
to remove AP and purify the DNA. Alternatively, religation can be prevented by
restriction enzyme digestion of unwanted vector fragments before or after ligation of the
desired vector.

For portions of vectors or coding sequences that require sequence
modifications, a variety of site-specific primer-directed mutagenesis methods are available.
The polymerase chain reaction (PCR) can be used to perform site-specific mutagenesis.
In another technique now standard in the art, a synthetic oligonucleotide encoding the
desired mutation is used as a primer to direct synthesis of a complementary nucleic acid
sequence of a single-stranded vector, such as pBS13+, that serves as a template for
construction of the extension product of the mutagenizing primer. The mutagenized
DNA is transformed into a host bacterium, and cultures of the transformed bacteria are
plated and identified. The identification of modified vectors may involve transfer of the
DNA of selected transformants to a nitrocellulose filter or other membrane and
hybridization of the "lifts" with kinased synthetic primer at a temperature that permits
hybridization of an exact match to the modified sequence but prevents hybridization
with the original strand. Transformants that contain DNA that hybridizes with the
probe are then cultured and serve as a reservoir of the modified DNA.

In the constructions set forth below, correct ligations for plasmid construction
are confirmed by first transforming E. coli strain DG101 or another suitable host with
the ligation mixture. Successful transformants are selected by ampicillin, tetracycline or
other antibiotic resistance or sensitivity or by using other markers, depending on the
mode of plasmid construction, as is understood in the art. Plasmids from the
transformants are then prepared according to the method of Clewell et al., 1969, Proc.
Natl. Acad. Sci. USA 62:1159, optionally following chloramphenicol amplification
(Clewell, 1972, J. Bacteriol. 110:667). Another method for obtaining plasmid DNA is
described as the "Base-Acid" extraction method at page 11 of the Bethesda Research
Laboratories publication Focus, volume 5, number 2, and very pure plasmid DNA can
be obtained by replacing steps 12 through 17 of the protocol with CsCl/ethidium
bromide ultracentrifugation of the DNA. The isolated DNA is analyzed by restriction
enzyme digestion and/or sequenced by the dideoxy method of Sanger et al., 1977,
Proc. Natl. Acad. Sci. USA 74:5463, as further described by Messing et al., 1981,
Nuc. Acids Res. 9:309, or by the method of Maxam et al., 1980, Methods in
Enzymology 65:499.

The control sequences, expression vectors, and transformation methods are
dependent on the type of host cell used to express the gene. Generally, procaryotic,
yeast, insect, or mammalian cells are used as hosts. Procaryotic hosts are in general the
most efficient and convenient for the production of recombinant proteins and are
therefore preferred for the expression of Tth polymerase.

The procaryote most frequently used to express recombinant proteins is E. coli.
For cloning and sequencing, and for expression of constructions under control of most
bacterial promoters, E. coli K12 strain MM294, obtained from the E. coli Genetic
Stock Center under GCSC #6135, can be used as the host. For expression vectors
with the PLNRBS control sequence, E. coli K12 strain MC1000 lambda lysogen,
N7N53cI857 SusP80, ATCC 39531, may be used. E. coli DG116, which was
deposited with the ATCC (ATCC 53606) on April 7, 1987, and E. coli KB2, which
was deposited with the ATCC (ATCC 53075) on March 29, 1985, are also useful host
cells. For M13 phage recombinants, E. coli strains susceptible to phage infection, such
as E. coli K12 strain DG98, are employed. The DG98 strain was deposited with the
ATCC (ATCC 39768) on July 13, 1984.

However, microbial strains other than E. coli can also be used, such as bacilli,
for example Bacillus subtilis, various species of Pseudomonas, and other bacterial
strains, for recombinant expression of Tth DNA polymerase. In such procaryotic
systems, plasmid vectors that contain replication sites and control sequences derived
from the host or a species compatible with the host are typically used.

For example, E. coli is typically transformed using derivatives of pBR322,
described by Bolivar et al., 1977, Gene 2:95. Plasmid pBR322 contains genes for
ampicillin and tetracycline resistance. These drug resistance markers can be either
retained or destroyed in constructing the desired vector and so help to detect the
presence of a desired recombinant. Commonly used procaryotic control sequences,
i.e., a promoter for transcription initiation, optionally with an operator, along with a
ribosome binding site sequence, include the β-lactamase (penicillinase) and lactose (lac)
promoter systems (Chang et al., 1977, Nature 198:1056), the tryptophan (trp)
promoter system (Goeddel et al., 1980, Nuc. Acids Res. 8:4057), and the
lambda-derived PL promoter (Shimatake et al., 1981, Nature 292:128) and N-gene ribosome
binding site (NRBS). A portable control system cassette is set forth in U.S. Patent No.
4,711,845, issued December 8, 1987. This cassette comprises a PL promoter operably
linked to the NRBS in turn positioned upstream of a third DNA sequence having at least
one restriction site that permits cleavage within six bp 3' of the NRBS sequence. Also
useful is the phosphatase A (phoA) system described by Chang et al. in European
Patent Publication No. 196,864, published October 8, 1986. However, any available
promoter system compatible with procaryotes can be used to construct a Tth expression
vector of the invention.

In addition to bacteria, eucaryotic microbes, such as yeast, can also be used as
recombinant host cells. Laboratory strains of Saccharomyces cerevisiae, Baker's yeast,
are most often used, although a number of other strains are commonly available. While
vectors employing the two micron origin of replication are common (Broach, 1983,
Meth. Enz. 101:307), other plasmid vectors suitable for yeast expression are known
(see, for example, Stinchcomb et al., 1979, Nature 282:39; Tschempe et al., 1980,
Gene 10:157; and Clarke et al., 1983, Meth. Enz. 101:300). Control sequences for
yeast vectors include promoters for the synthesis of glycolytic enzymes (Hess et al.,
1968, J. Adv. Enzyme Reg. 7:149, and Holland et al., 1978, Biotechnology 17:4900).
Additional promoters known in the art include the promoter for 3-phosphoglycerate
kinase (Hitzeman et al., 1980, J. Biol. Chem. 255:2073) and those for other glycolytic
enzymes, such as glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate
decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase,
3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase,
phosphoglucose isomerase, and glucokinase. Other promoters that have the additional
advantage of transcription controlled by growth conditions are the promoter regions for
alcohol dehydrogenase 2, isocytochrome C, acid phosphatase, degradative enzymes
associated with nitrogen metabolism, and enzymes responsible for maltose and
galactose utilization (Holland, supra).

Terminator sequences may also be used to enhance expression when placed at
the 3' end of the coding sequence. Such terminators are found in the 3' untranslated
region following the coding sequences in yeast-derived genes. Many vectors contain
control sequences derived from the enolase gene contained in plasmid peno46 (Holland
et al., 1981, J. Biol. Chem. 256:1385) or the LEU2 gene obtained from YEp13
(Broach et al., 1978, Gene 8:121); however, any vector containing a yeast-compatible
promoter, origin of replication, and other control sequences is suitable for use in
constructing yeast Tth expression vectors.

The Tth gene can also be expressed in eucaryotic host cell cultures derived from
multicellular organisms. See, for example, Tissue Culture, Academic Press, Cruz and
Patterson, editors (1973). Useful host cell lines include COS-7, COS-A2, CV-1,
murine cells such as murine myelomas N51 and VERO, HeLa cells, and Chinese
hamster ovary (CHO) cells. Expression vectors for such cells ordinarily include
promoters and control sequences compatible with mammalian cells such as, for
example, the commonly used early and late promoters from Simian Virus 40 (SV 40)
(Fiers et al., 1978, Nature 273:113), or other viral promoters such as those derived
from polyoma, adenovirus 2, bovine papilloma virus (BPV), or avian sarcoma viruses,
or immunoglobulin promoters and heat shock promoters. A system for expressing
DNA in mammalian systems using a BPV vector system is disclosed in U.S. Patent
No. 4,419,446. A modification of this system is described in U.S. Patent No.
4,601,978. General aspects of mammalian cell host system transformations have been
described by Axel, U.S. Patent No. 4,399,216. "Enhancer" regions are also important
in optimizing expression; these are, generally, sequences found upstream of the
promoter region. Origins of replication may be obtained, if needed, from viral sources.
However, integration into the chromosome is a common mechanism for DNA
replication in eucaryotes.

Plant cells can also be used as hosts, and control sequences compatible with
plant cells, such as the nopaline synthase promoter and polyadenylation signal
sequences (Depicker et al., 1982, J. Mol. Appl. Gen. 1:561) are available. Expression
systems employing insect cells utilizing the control systems provided by baculovirus
vectors have also been described (Miller et al., in Genetic Engineering (1986), Setlow
et al., eds., Plenum Publishing, Vol. 8, pp. 277-297). Insect cell-based expression can
be accomplished in Spodoptera frugiperda. These systems are also successful in
producing recombinant Tth polymerase.

Depending on the host cell used, transformation is done using standard
techniques appropriate to such cells. The calcium treatment employing calcium
chloride, as described by Cohen, 1972, Proc. Natl. Acad. Sci. USA 69:2110, is used
for procaryotes or other cells that contain substantial cell wall barriers. Infection with
Agrobacterium tumefaciens (Shaw et al., 1983, Gene 23:315) is used for certain plant
cells. For mammalian cells, the calcium phosphate precipitation method of Graham and
van der Eb, 1978, Virology 52:546 is preferred. Transformations into yeast are carried
out according to the method of Van Solingen et al., 1977, J. Bact. 130:946 and Hsiao
et al., 1979, Proc. Natl. Acad. Sci. USA 76:3829.

Once the Tth DNA polymerase has been expressed in a recombinant host cell,
purification of the protein may be desired. Although the purification procedures
previously described can be used to purify the recombinant thermostable polymerase of
the invention, hydrophobic interaction chromatography purification methods are
preferred. Hydrophobic interaction chromatography is a separation technique in which
substances are separated on the basis of differing strengths of hydrophobic interaction
with an uncharged bed material containing hydrophobic groups. Typically, the column
is first equilibrated under conditions favorable to hydrophobic binding, e.g., high ionic
strength. A descending salt gradient may be used to elute the sample.

According to the invention, the aqueous mixture (containing either native or
recombinant Tth DNA polymerase) is loaded onto a column containing a relatively
strong hydrophobic gel such as phenyl sepharose (manufactured by Pharmacia) or
Phenyl TSK (manufactured by Toyo Soda). To promote hydrophobic interaction with
a phenyl sepharose column, a solvent is used which contains, for example, greater than
or equal to 0.2 M ammonium sulfate, with 0.2 M being preferred. The column and the
sample are adjusted to 0.2 M ammonium sulfate in 50 mM Tris, pH 7.5, and 1 mM
EDTA ("TE") buffer that also contains 1 mM DTT, and the sample is applied to the
column. The column is washed with the 0.2 M ammonium sulfate buffer. The enzyme
may then be eluted with solvents which attenuate hydrophobic interactions such as, for
example, decreasing salt gradients, ethylene or propylene glycol, or urea. For
recombinant Tth polymerase, a preferred embodiment involves washing the column
sequentially with the Tris-EDTA buffer and the Tris-EDTA buffer containing 20%
ethylene glycol. The Tth polymerase is subsequently eluted from the column with a 0
to 4 M urea gradient in the Tris-EDTA ethylene glycol buffer.

For long-term stability, Tth DNA polymerase enzyme is stored in a buffer that
contains one or more non-ionic polymeric detergents. Such detergents are generally
those that have a molecular weight in the range of approximately 100 to 250,000,
preferably about 4,000 to 200,000 daltons, and stabilize the enzyme at a pH of from
about 3.5 to about 9.5, preferably from about 4 to 8.5. Examples of such detergents
include those specified on pages 295-298 of McCutcheon's Emulsifiers & Detergents,
North American edition (1983), published by the McCutcheon Division of MC
Publishing Co., 175 Rock Road, Glen Rock, NJ (USA), the entire disclosure of which
is incorporated herein by reference. Preferably, the detergents are selected from the
group comprising ethoxylated fatty alcohol ethers and lauryl ethers, ethoxylated alkyl
phenols, octylphenoxy polyethoxy ethanol compounds, modified oxyethylated and/or
oxypropylated straight-chain alcohols, polyethylene glycol monooleate compounds,
polysorbate compounds, and phenolic fatty alcohol ethers. More particularly preferred
are Tween 20, a polyoxyethylated (20) sorbitan monolaurate from ICI Americas Inc.,
Wilmington, DE, and Iconol™ NP-40, an ethoxylated alkyl phenol (nonyl) from
BASF Wyandotte Corp., Parsippany, NJ.

The thermostable enzyme of this invention may be used for any purpose in
which such enzyme activity is necessary or desired. In a particularly preferred
embodiment, the enzyme catalyzes the nucleic acid amplification reaction known as
PCR. This process for amplifying nucleic acid sequences is disclosed and claimed in
U.S. Patent No. 4,683,202, issued July 28, 1987, the disclosure of which is
incorporated herein by reference. The PCR nucleic acid amplification method involves
amplifying at least one specific nucleic acid sequence contained in a nucleic acid or a
mixture of nucleic acids and produces double-stranded DNA.

For ease of discussion, the protocol set forth below assumes that the specific
sequence to be amplified is contained in a double-stranded nucleic acid. However, the
process is equally useful in amplifying single-stranded nucleic acid, such as mRNA,
although in the preferred embodiment the ultimate product is still double-stranded
DNA. In the amplification of a single-stranded nucleic acid, the first step involves the
synthesis of a complementary strand (one of the two amplification primers can be used
for this purpose), and the succeeding steps proceed as in the double-stranded
amplification process described below.

This amplification process comprises the steps of:

(a) contacting each nucleic acid strand with four different nucleoside
triphosphates and one oligonucleotide primer for each strand of the specific sequence
being amplified, wherein each primer is selected to be substantially complementary to
the different strands of the specific sequence, such that the extension product
synthesized from one primer, when it is separated from its complement, can serve as a
template for synthesis of the extension product of the other primer, said contacting
being at a temperature which allows hybridization of each primer to a complementary
nucleic acid strand;

(b) contacting each nucleic acid strand, at the same time as or after step (a),
with a DNA polymerase from Thermus thermophilus which enables combination of the
nucleoside triphosphates to form primer extension products complementary to each
strand of the specific nucleic acid sequence;

(c) maintaining the mixture from step (b) at an effective temperature for an
effective time to promote the activity of the enzyme and to synthesize, for each different
sequence being amplified, an extension product of each primer which is complementary
to each nucleic acid strand template, but not so high as to separate each extension
product from the complementary strand template;

(d) heating the mixture from step (c) for an effective time and at an effective
temperature to separate the primer extension products from the templates on which they
were synthesized to produce single-stranded molecules but not so high as to denature
irreversibly the enzyme;

(e) cooling the mixture from step (d) for an effective time and to an effective
temperature to promote hybridization of a primer to each of the single-stranded
molecules produced in step (d); and

(f) maintaining the mixture from step (e) at an effective temperature for an
effective time to promote the activity of the enzyme and to synthesize, for each different
sequence being amplified, an extension product of each primer which is complementary
to each nucleic acid strand template produced in step (d) but not so high as to separate
each extension product from the complementary strand template. The effective times
and temperatures in steps (e) and (f) may coincide, so that steps (e) and (f) can be
carried out simultaneously. Steps (d)-(f) are repeated until the desired level of
amplification is obtained.
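
The repetition of steps (d)-(f) can be pictured as a simple temperature program. The sketch below is illustrative only: the `Step` type, the `build_program` helper, the cycle count, and the specific temperatures and hold times are assumptions made for the example (in particular the 72°C extension setting is a placeholder and is not taken from this description).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def build_program(cycles: int, denature: Step, anneal: Step,
                  extend: Optional[Step] = None) -> List[Step]:
    """Flat list of steps for repeating (d)-(f) `cycles` times.

    When `extend` is None, steps (e) and (f) are treated as coinciding,
    so a single hold serves for both hybridization and extension.
    """
    per_cycle = [denature, anneal]
    if extend is not None:
        per_cycle.append(extend)
    return [step for _ in range(cycles) for step in per_cycle]


# Placeholder settings (assumptions, not prescribed values).
denature = Step("d: separate strands", 95.0, 30)
anneal = Step("e: hybridize primers", 55.0, 30)
extend = Step("f: extend primers", 72.0, 60)

program = build_program(25, denature, anneal, extend)
total_minutes = sum(step.seconds for step in program) / 60
print(len(program), "steps,", total_minutes, "minutes of hold time")
```

Passing no `extend` step models the case in which the effective times and temperatures of steps (e) and (f) coincide.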

The amplification method is useful not only for producing large amounts of a
specific nucleic acid sequence of known sequence but also for producing nucleic acid
sequences which are known to exist but are not completely specified. One need know
only a sufficient number of bases at both ends of the sequence in sufficient detail so that
two oligonucleotide primers can be prepared which will hybridize to different strands of
the desired sequence at relative positions along the sequence such that an extension
product synthesized from one primer, when separated from the template (complement),
can serve as a template for extension of the other primer into a nucleic acid sequence of
defined length. The greater the knowledge about the bases at both ends of the
sequence, the greater can be the specificity of the primers for the target nucleic acid
sequence and the efficiency of the process. In any case, an initial copy of the sequence
to be amplified must be available, although the sequence need not be pure or a discrete
molecule. In general, the amplification process involves a chain reaction for producing,
in exponential quantities relative to the number of reaction steps involved, at least one
specific nucleic acid sequence given that (a) the ends of the required sequence are
known in sufficient detail that oligonucleotides can be synthesized which will hybridize
to them, and (b) that a small amount of the sequence is available to initiate the chain
reaction. The product of the chain reaction will be a discrete nucleic acid duplex with
termini corresponding to the ends of the specific primers employed.
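
The phrase "exponential quantities relative to the number of reaction steps" can be made concrete with an idealized model in which every cycle multiplies the number of copies by (1 + efficiency). This is a simplification offered for illustration: the per-cycle efficiency parameter and the function name are assumptions, and real reactions plateau as reagents are consumed.

```python
def copies_after(cycles: int, initial_copies: float = 1.0,
                 efficiency: float = 1.0) -> float:
    """Idealized copy number after `cycles` rounds of amplification.

    efficiency = 1.0 means every template is copied in every cycle
    (perfect doubling); lower values model incomplete extension.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, "cycles:", f"{copies_after(n):.3g}", "copies (ideal),",
              f"{copies_after(n, efficiency=0.8):.3g}", "copies (80% efficiency)")
```

Under perfect doubling, 20 cycles give about a million copies per starting molecule, which is why only a small amount of the sequence is needed to initiate the chain reaction.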

Any nucleic acid sequence, in purified or nonpurified form, can be utilized as
the starting nucleic acid(s), provided it contains or is suspected to contain the specific
nucleic acid sequence desired. The nucleic acid to be amplified can be obtained from
any source, for example, from plasmids such as pBR322, from cloned DNA or RNA,
or from natural DNA or RNA from any source, including bacteria, yeast, viruses,
organelles, and higher organisms such as plants and animals. DNA or RNA may be
extracted from blood, tissue material such as chorionic villi, or amniotic cells by a
variety of techniques. See, e.g., Maniatis et al., supra, pp. 280-281. Thus, the
process may employ, for example, DNA or RNA, including messenger RNA, which
DNA or RNA may be single-stranded or double-stranded. In addition, a DNA-RNA
hybrid which contains one strand of each may be utilized. A mixture of any of these
nucleic acids can also be employed, as can nucleic acids produced from a previous
amplification reaction (using the same or different primers). The specific nucleic acid
sequence to be amplified may be only a fraction of a large molecule or can be present
initially as a discrete molecule, so that the specific sequence constitutes the entire
nucleic acid.

The sequence to be amplified need not be present initially in a pure form; the
sequence can be a minor fraction of a complex mixture, such as a portion of the
β-globin gene contained in whole human DNA (as exemplified in Saiki et al., 1985,
Science 230:1530-1534) or a portion of a nucleic acid sequence due to a particular
microorganism, which organism might constitute only a very minor fraction of a
particular biological sample. The cells can be directly used in the amplification process
after suspension in hypotonic buffer and heat treatment at about 90-100°C until cell
lysis and dispersion of intracellular components occur (generally 1 to 15 minutes).
After the heating step, the amplification reagents may be added directly to the lysed
cells. The starting nucleic acid sequence may contain more than one desired specific
nucleic acid sequence. The amplification process is useful not only for producing large
amounts of one specific nucleic acid sequence but also for amplifying simultaneously
more than one different specific nucleic acid sequence located on the same or different
nucleic acid molecules.

Primers play a key role in the PCR process. The word "primer" as used in
describing the amplification process can refer to more than one primer, particularly in
the case where there is some ambiguity in the information regarding the terminal
sequence(s) of the fragment to be amplified. For instance, in the case where a nucleic
acid sequence is inferred from protein sequence information, a collection of primers
containing sequences representing all possible codon variations based on degeneracy of
the genetic code will be used for each strand. One primer from this collection will be
sufficiently homologous with the end of the desired sequence to be amplified to be
useful for amplification.
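
A collection of primers covering all codon variations for a short peptide can be enumerated mechanically. The sketch below assumes the standard genetic code and uses hypothetical peptides; the function name `degenerate_primers` is introduced only for this illustration, and the sequences it returns are coding-strand sequences (a primer for the opposite strand would be the reverse complement).

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODONS = {}
for i, aa in enumerate(AMINO):
    codon = BASES[i // 16] + BASES[(i // 4) % 4] + BASES[i % 4]
    CODONS.setdefault(aa, []).append(codon)


def degenerate_primers(peptide: str) -> list:
    """All DNA sequences that encode `peptide` (one-letter amino acid codes)."""
    choices = [CODONS[aa] for aa in peptide]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # Hypothetical peptide Met-Trp-Lys: only lysine has two codons.
    print(degenerate_primers("MWK"))
    # Residues with six codons (Leu, Ser, Arg) inflate the pool quickly.
    print(len(degenerate_primers("MLSR")), "sequences for Met-Leu-Ser-Arg")
```

Because the pool size is the product of the codon counts, stretches rich in methionine and tryptophan (one codon each) give the smallest primer collections.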

In addition, more than one specific nucleic acid sequence can be amplified from
the first nucleic acid or mixture of nucleic acids, so long as the appropriate number of
different oligonucleotide primers are utilized. For example, if two different specific
nucleic acid sequences are to be produced, four primers are utilized. Two of the
primers are specific for one of the specific nucleic acid sequences and the other two
primers are specific for the second specific nucleic acid sequence. In this manner, each
of the two different specific sequences can be produced exponentially by the present
process.

A sequence within a given sequence can be amplified after a given number of
cycles to obtain greater specificity of the reaction by adding, after at least one cycle of
amplification, a set of primers that are complementary to internal sequences (that are not
at the ends) of the sequence to be amplified. Such primers may be added at any stage
and will provide a shorter amplified fragment. Alternatively, a longer fragment can be
prepared by using primers with non-complementary 5' ends but having some overlap
with the primers previously utilized in the amplification.

Primers also play a key role when the amplification process is used for in vitro
mutagenesis. The product of an amplification reaction where the primers employed are
not exactly complementary to the original template will contain the sequence of the
primer rather than the template, so introducing an in vitro mutation. In further cycles
this mutation will be amplified with an undiminished efficiency because no further
mispaired priming is required. The process of making an altered DNA sequence as
described above could be repeated on the altered DNA using different primers to induce
further sequence changes. In this way, a series of mutated sequences can gradually be
produced wherein each new addition to the series differs from the last in a minor way,
but from the original DNA source sequence in an increasingly major way.

Because the primer can contain as part of its sequence a non-complementary
sequence, provided that a sufficient amount of the primer contains a sequence that is
complementary to the strand to be amplified, many other advantages can be realized.
For example, a nucleotide sequence that is not complementary to the template sequence
(such as, e.g., a promoter, linker, coding sequence, etc.) may be attached at the 5' end
of one or both of the primers and so appended to the product of the amplification
process. After the extension primer is added, sufficient cycles are run to achieve the
desired amount of new template containing the non-complementary nucleotide insert.
This allows production of large quantities of the combined fragments in a relatively
short period of time (e.g., two hours or less) using a simple technique.
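
How a non-complementary 5' sequence ends up in the product can be shown with plain string operations. The following is a toy model under stated assumptions: the template, the primers, and the `amplicon` helper are hypothetical, annealing is reduced to an exact match of the 3' portion of each primer, and the appended tails (GAATTC and GGATCC) were chosen only as recognizable examples.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon(template: str, fwd: str, rev: str,
             fwd_anneal: int, rev_anneal: int) -> str:
    """Top strand of the product made from two 5'-tailed primers.

    Only the last `fwd_anneal` / `rev_anneal` bases of each primer must
    match the template; everything 5' of that is carried into the product.
    """
    start = template.find(fwd[-fwd_anneal:])
    if start < 0:
        raise ValueError("forward primer does not anneal")
    inner_start = start + fwd_anneal
    end = template.find(revcomp(rev[-rev_anneal:]), inner_start)
    if end < 0:
        raise ValueError("reverse primer does not anneal downstream")
    return fwd + template[inner_start:end] + revcomp(rev)


if __name__ == "__main__":
    template = "ACGTTGCAATGGCCTTAGGCATCGA"      # hypothetical target, top strand
    fwd = "GAATTC" + "TTGCAA"                   # 5' tail + annealing region
    rev = "GGATCC" + "GATGCC"                   # 5' tail + annealing region
    print(amplicon(template, fwd, rev, 6, 6))
```

The printed product begins and ends with the two tails even though neither occurs in the template, mirroring the statement that the 5' additions are appended to the amplification product.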

Oligonucleotide primers can be prepared using any suitable method, such as,
for example, the phosphotriester and phosphodiester methods described above, or
automated embodiments thereof. In one such automated embodiment,
diethylphosphoramidites are used as starting materials and may be synthesized as
described by Beaucage et al., 1981, Tetrahedron Letters 22:1859-1862. One method
for synthesizing oligonucleotides on a modified solid support is described in U.S.
Patent No. 4,458,066. One can also use a primer that has been isolated from a
biological source (such as a restriction endonuclease digest).

No matter what primers are used, however, the reaction mixture must contain a
template for PCR to occur, because the specific nucleic acid sequence is produced by
using a nucleic acid containing that sequence as a template. The first step involves
contacting each nucleic acid strand with four different nucleoside triphosphates and one
oligonucleotide primer for each strand of each specific nucleic acid sequence being
amplified or detected. If the nucleic acids to be amplified or detected are DNA, then the
nucleoside triphosphates are usually dATP, dCTP, dGTP, and TTP, although various
nucleotide derivatives can also be used in the process. The concentration of nucleotide
triphosphates can vary widely. Typically the concentration is 50-200 µM in each dNTP
in the buffer for amplification, and MgCl2 is present in the buffer in an amount of 1 to
3 mM to increase the efficiency and specificity of the reaction. However, dNTP
concentrations of 1-20 µM may be preferred for some applications, such as DNA
sequencing.

The nucleic acid strands of the target nucleic acid serve as templates for the
synthesis of additional nucleic acid strands, which are extension products of the
primers. This synthesis can be performed using any suitable method, but generally
occurs in a buffered aqueous solution, preferably at a pH of 7-9, most preferably about
8. To facilitate synthesis, a molar excess (for cloned nucleic acid, usually about 1000:1
primer:template and for genomic nucleic acid, usually about 10^8:1 primer:template) of
the two oligonucleotide primers is added to the buffer containing the template strands.
As a practical matter, the amount of primer added will generally be in molar excess over
the amount of complementary strand (template) when the sequence to be amplified is
contained in a mixture of complicated long-chain nucleic acid strands. A large molar
excess is preferred to improve the efficiency of the process.

The mixture of template, primers, and nucleoside triphosphates is then treated
according to whether the nucleic acids being amplified or detected are double- or
single-stranded. If the nucleic acids are single-stranded, then no denaturation step need be
employed, and the reaction mixture is held at a temperature which promotes
hybridization of the primer to its complementary target (template) sequence. Such
temperature is generally from about 35°C to 65°C or more, preferably about 37-60°C
for an effective time, generally from a few seconds to five minutes, preferably from 30
seconds to one minute. A hybridization temperature of 45-58°C is used for Tth DNA
polymerase, and 15-mer or longer primers are used to increase the specificity of primer
hybridization. Shorter primers require lower hybridization temperatures. The
complement to the original single-stranded nucleic acids can be synthesized by adding
Tth DNA polymerase in the presence of the appropriate buffer, dNTPs, and one or
more oligonucleotide primers. If an appropriate single primer is added, the primer
extension product will be complementary to the single-stranded nucleic acid and will be
hybridized with the nucleic acid strand in a duplex of strands of equal or unequal length
(depending on where the primer hybridizes to the template), which may then be
separated into single strands as described above to produce two single, separated,
complementary strands. Alternatively, two or more appropriate primers (one of which
will prime synthesis using the extension product of the other primer as a template) may
be added to the single-stranded nucleic acid and the reaction carried out.
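
The observation that shorter primers require lower hybridization temperatures can be illustrated with a widely used rule of thumb for short oligonucleotides, which estimates the melting temperature as 2°C per A or T plus 4°C per G or C. This rule is not part of the present description and is only a rough guide; the function name and the example primers are hypothetical.

```python
def rule_of_thumb_tm(primer: str) -> int:
    """Approximate melting temperature (deg C) of a short oligonucleotide.

    Assumption: 2 deg C per A/T and 4 deg C per G/C, a rough estimate that
    ignores salt, primer concentration, and sequence context.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("primer must contain only A, C, G, T")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for primer in ("ATGCATGCATGCATG", "ATGCATGCATGCATGCATGC"):
        print(len(primer), "-mer:", rule_of_thumb_tm(primer), "deg C (estimate)")
```

The estimate rises with primer length and G+C content, which is consistent with choosing longer primers when a more stringent hybridization temperature is wanted.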

If the nucleic acid contains two strands, as in the case of amplification of a
double-stranded target or second-cycle amplification of a single-stranded target, the
strands of nucleic acid must be separated before the primers are hybridized. This strand
separation can be accomplished by any suitable denaturing method, including physical,
chemical or enzymatic means. One preferred physical method of separating the strands
of the nucleic acid involves heating the nucleic acid until complete (>99%) denaturation
occurs. Typical heat denaturation involves temperatures ranging from about 90 to
105°C for times generally ranging from about a few seconds to 5 minutes, depending
on the composition and size of the nucleic acid. Preferably, the effective denaturing
temperature is 90-100°C for 10 seconds to 3 minutes. Strand separation may also be
induced by an enzyme from the class of enzymes known as helicases or the enzyme
RecA, which has helicase activity and in the presence of riboATP is known to denature
DNA. The reaction conditions suitable for separating the strands of nucleic acids with
helicases are described by Kuhn Hoffmann-Berling, 1978, CSH-Quantitative Biology
42:63, and techniques for using RecA are reviewed in Radding, 1982, Ann. Rev.
Genetics 16:405-437. The denaturation produces two separated complementary strands
of equal or unequal length.

If the double-stranded nucleic acid is denatured by heat, the reaction mixture is
allowed to cool to a temperature which promotes hybridization of each primer to the
complementary target (template) sequence. This temperature is usually from about
35°C to 65°C or more, depending on reagents, preferably 37-60°C. The hybridization
temperature is maintained for an effective time, generally 30 seconds to 5 minutes, and
preferably 1-3 minutes. In practical terms, the temperature is simply lowered from
about 95°C to as low as 37°C, and hybridization occurs at a temperature within this
range.

Whether the nucleic acid is single- or double-stranded, the DNA polymerase
from Thermus thermophilus may be added at the denaturation step or when the
temperature is being reduced to or is in the range for promoting hybridization.
Although the thermostability of Tth polymerase allows one to add Tth polymerase to the
reaction mixture at any time, one can substantially inhibit non-specific amplification by
adding the polymerase to the reaction mixture at a point in time when the mixture will
not be cooled below the stringent hybridization temperature. After hybridization, the
reaction mixture is then heated to or maintained at a temperature at which the activity of
the enzyme is promoted or optimized, i.e., a temperature sufficient to increase the
activity of the enzyme in facilitating synthesis of the primer extension products from the
hybridized primer and template. The temperature must actually be sufficient to
synthesize an extension product of each primer which is complementary to each nucleic
acid template, but must not be so high as to denature each extension product from its
complementary template (i.e., the temperature is generally less than about 80-90°C).

Depending on the nucleic acid(s) employed, the typical temperature effective for
this synthesis reaction generally ranges from about 40 to 80°C, preferably 50-75°C.
The temperature more preferably ranges from about 65-75°C for Thermus
thermophilus DNA polymerase. The period of time required for this synthesis may
range from about 0.5 to 40 minutes or more, depending mainly on the temperature, the
length of the nucleic acid, the enzyme, and the complexity of the nucleic acid mixture.
The extension time is usually about 30 seconds to three minutes. If the nucleic acid is
longer, a longer time period is generally required for complementary strand synthesis.
The newly synthesized strand and the complement nucleic acid strand form a
double-stranded molecule which is used in the succeeding steps of the amplification
process. In the next step, the strands of the double-stranded molecule are separated by
heat denaturation at a temperature and for a time effective to denature the molecule, but
not at a temperature and for a period so long that the thermostable enzyme is completely
and irreversibly denatured or inactivated. After this denaturation of template, the
temperature is decreased to a level which promotes hybridization of the primer to the
complementary single-stranded molecule (template) produced from the previous step,
as described above.

After this hybridization step, or concurrently with the hybridization step, the
temperature is adjusted to a temperature that is effective to promote the activity of the
thermostable enzyme to enable synthesis of a primer extension product using as a
template both the newly synthesized and the original strands. The temperature again
must not be so high as to separate (denature) the extension product from its template, as
described above. Hybridization may occur during this step, so that the previous step of
cooling after denaturation is not required. In such a case, using simultaneous steps, the
preferred temperature range is 50-70°C.
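The temperature windows quoted above for denaturation, hybridization, and extension can be collected into a small check for a proposed cycle. The following is a minimal sketch only: the `Step` class, the `RANGES` table, and the example profile are illustrative assumptions, and the windows are the general ranges quoted in the text (the hybridization window is described as "35°C to 65°C or more", so the upper bound here is not a hard limit).

```python
from dataclasses import dataclass

# General temperature windows (degrees C) quoted in the text for each step.
# These are illustrative bounds, not limits imposed by the process itself.
RANGES = {
    "denature": (90.0, 105.0),
    "hybridize": (35.0, 65.0),
    "extend": (40.0, 80.0),
}


@dataclass
class Step:
    name: str       # one of the keys of RANGES
    temp_c: float   # hold temperature in degrees C
    seconds: float  # hold time in seconds


def out_of_range(steps):
    """Return the names of steps whose temperature lies outside its window."""
    bad = []
    for step in steps:
        low, high = RANGES[step.name]
        if not (low <= step.temp_c <= high):
            bad.append(step.name)
    return bad


# Hypothetical three-step profile chosen from within the quoted windows.
profile = [
    Step("denature", 95.0, 30.0),
    Step("hybridize", 55.0, 30.0),
    Step("extend", 72.0, 60.0),
]
print(out_of_range(profile))
```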

The heating and cooling steps involved in one cycle of strand separation,
hybridization, and extension product synthesis can be repeated as often as needed to
produce the desired quantity of the specific nucleic acid sequence. The only limitation
is the amount of the primers, thermostable enzyme, and nucleotide triphosphates
present. Usually, from 15 to 30 cycles are completed. For diagnostic detection of
amplified DNA, the number of cycles will depend on the nature of the sample. For
example, fewer cycles will be required if the sample being amplified is pure. If the
sample is a complex mixture of nucleic acids, more cycles will be required to amplify
the signal sufficiently for detection. For general amplification and detection, the
process is repeated about 15 times. When amplification is used to generate sequences
to be detected with labeled sequence-specific probes and when human genomic DNA is
the target of amplification, the process is repeated 15 to 30 times to amplify the
sequence sufficiently that a clearly detectable signal is produced, i.e., so that
background noise does not interfere with detection.
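To give a feel for why 15 to 30 cycles are typical, the idealized yield of exponential amplification can be sketched as below. This is a textbook model, not something stated in the text: the `efficiency` parameter (fraction of templates copied per cycle) is an assumption, and real reactions fall short of it as primers, enzyme, and nucleotide triphosphates are consumed.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Idealized fold increase in target copies after a number of cycles.

    efficiency is the assumed fraction of templates copied in each cycle
    (1.0 means perfect doubling); it is a modelling assumption.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


for n in (15, 30):
    print(n, fold_amplification(n))
```

Under perfect doubling, 15 cycles give roughly a 3 x 10^4-fold increase and 30 cycles roughly 10^9-fold, which is consistent with more cycles being needed for complex samples.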

No additional nucleotides, primers, or thermostable enzyme need be added after
the initial addition, provided that no key reagent has been exhausted and that the
enzyme has not become denatured or inactivated irreversibly, in which case additional
polymerase or other reagent would have to be added for the reaction to continue.
Addition of such materials at each step, however, will not adversely affect the reaction.
After the appropriate number of cycles have been completed to produce the desired
amount of the specific nucleic acid sequence, the reaction may be halted in the usual
manner, e.g., by inactivating the enzyme by adding EDTA, phenol, SDS, or CHCl3 or
by separating the components of the reaction.

The amplification process may be conducted continuously. In one embodiment
of an automated process, the reaction mixture may be temperature cycled such that the
temperature is programmed to be controlled at a certain level for a certain time. One
such instrument for this purpose is the automated machine for handling the
amplification reaction developed and marketed by Perkin-Elmer Cetus Instruments.
Detailed instructions for carrying out PCR with the instrument are available upon
purchase of the instrument.

Tth DNA polymerase is very useful in carrying out the diverse processes in
which amplification of a nucleic acid sequence by the polymerase chain reaction is
useful. The amplification method may be utilized to clone a particular nucleic acid
sequence for insertion into a suitable expression vector, as described in U.S. Patent
No. 4,800,159. The vector may be used to transform an appropriate host organism to
produce the gene product of the sequence by standard methods of recombinant DNA
technology. Such cloning may involve direct ligation into a vector using blunt-end
ligation, or use of restriction enzymes to cleave at sites contained within the primers.
Other processes suitable for Tth polymerase include those described in U.S. Patent
Nos. 4,683,194; 4,683,195; and 4,683,202 and European Patent Publication Nos.
229,701; 237,362; and 258,017; these patents and publications are incorporated herein
by reference. In addition, the present enzyme is useful in asymmetric PCR (see
Gyllensten and Erlich, 1988, Proc. Natl. Acad. Sci. USA 85:7652-7656, incorporated
herein by reference); inverse PCR (Ochman et al., 1988, Genetics 120:621,
incorporated herein by reference); and for DNA sequencing (see Innis et al., 1988,
Proc. Natl. Acad. Sci. USA 85:9436-9440, and McConlogue et al., 1988, Nuc. Acids
Res. 16(20):9869). Tth polymerase also has reverse transcriptase activity.

The following examples are offered by way of illustration only and are by no means
intended to limit the scope of the claimed invention. In these examples, all percentages
are by weight if for solids and by volume if for liquids, unless otherwise noted, and all
temperatures are given in degrees Celsius.

Example 1
Purification of Thermus thermophilus DNA Polymerase

This example describes the isolation of Tth DNA polymerase from Thermus
thermophilus. Tth DNA polymerase was assayed at various points during purification
according to the method described for Taq polymerase in Lawyer et al., 1989, J. Biol.
Chem. 264(11):6427-6437, incorporated herein by reference.

Typically, this assay is performed in 50 µl of a reaction mixture composed of
25 mM TAPS-HCl, pH 9.5 (20°C); 50 mM KCl; 2 mM MgCl2; 1 mM
β-mercaptoethanol; 200 µM in each of dATP, dGTP, and TTP; 100 µM α-32P-dCTP
(0.03 to 0.07 µCi/nmol); 12.5 µg of activated salmon sperm DNA; and polymerase.
The reaction is initiated by addition of polymerase in diluent (diluent is composed of 10
mM Tris-HCl, pH 8.0, 50 mM KCl, 0.1 mM EDTA, 1 mg/ml autoclaved gelatin, 0.5%
NP40, 0.5% Tween 20, and 1 mM β-mercaptoethanol), and the reaction is carried out
at 74°C. After a 10 minute incubation, the reaction is stopped by adding 10 µl of 60
mM EDTA. The reaction mixture is centrifuged, and 50 µl of reaction mixture is
transferred to 1.0 ml of 50 µg/ml carrier DNA in 2 mM EDTA (at 0°C). An equal
volume (1 ml) of 20% TCA, 2% sodium pyrophosphate is added and mixed. The
mixture is incubated at 0°C for 15 to 20 minutes and then filtered through Whatman
GF/C filters and extensively washed (6 x 5 ml) with a cold mixture containing 5% TCA
and 1% pyrophosphate, followed by a cold 95% ethanol wash. The filters are then
dried and the radioactivity counted. Background (minus enzyme) is usually 0.001% to
0.01% of input cpm. About 50 to 250 pmol of 32P-dCTP standard is spotted for unit
calculation. One unit is equal to 10 nmol of dNTP incorporated in 30 minutes at 74°C.
Units are calculated as follows.

$$\text{pmol incorporated} = \frac{\text{sample cpm} - \text{enzyme dil. cpm}}{\text{specific activity of dCTP (cpm/pmol)}}$$

$$\text{units/ml} = \frac{\text{pmol incorporated} \times 3 \times \text{dilution factor} \times 4}{4.167 \times 10}$$

Enzyme activity is not completely linear with time. With purified enzyme, a thirty
minute assay is usually 2.5 x a 10 minute assay.
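The unit calculation can be written out as a short sketch. The function and parameter names are illustrative. The reading of the constants is an assumption rather than something the text states: the factor 3 is taken to scale the 10 minute assay to the 30 minute unit definition, the factor 4 to scale dCTP incorporation to all four dNTPs, the 10 to be the 10 nmol per unit, and 4.167 to be the microliters of diluted enzyme represented in the counted aliquot.

```python
def pmol_incorporated(sample_cpm, enzyme_dil_cpm, specific_activity_cpm_per_pmol):
    """Background-corrected dCMP incorporation in pmol."""
    return (sample_cpm - enzyme_dil_cpm) / specific_activity_cpm_per_pmol


def units_per_ml(pmol, dilution_factor):
    """Units/ml from pmol incorporated, using the constants given in the text.

    The physical meaning attached to 3, 4, 4.167, and 10 in the lead-in is an
    interpretation; only the arithmetic below is taken from the formula.
    """
    return pmol * 3 * dilution_factor * 4 / (4.167 * 10)


# Hypothetical counts, for illustration only.
pmol = pmol_incorporated(sample_cpm=10100, enzyme_dil_cpm=100,
                         specific_activity_cpm_per_pmol=100)
print(pmol, units_per_ml(pmol, dilution_factor=50))
```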

About 202 g of frozen Thermus thermophilus strain HB8 cells (ATCC No.
27,634) were thawed in 100 ml of 3X TE-DTT buffer (150 mM Tris-Cl, pH 7.5, 3
mM EDTA, and 3 mM dithiothreitol) containing 2.4 mM PMSF (from 144 mM stock in
DMF) and homogenized at low speed in a blender. All operations were carried out at 0
to 4°C unless otherwise stated. All glassware was baked prior to use, and solutions
used in the purification were autoclaved, if possible, prior to use. The thawed cells
were lysed in an Aminco French pressure cell (18,000 psi), then diluted with an equal
volume of 1X TE-DTT buffer containing 2.4 mM PMSF and sonicated to reduce
viscosity (1/3 aliquots, 80% output, 10 minutes, 50% duty cycle). The lysate was
diluted with additional 1X TE-DTT buffer containing fresh 2.4 mM PMSF to final
5.5X cell wet weight. The resulting fraction, fraction I (1,100 ml), contained 15.6 g of
protein and 46.8 x 10^4 units of activity.

Ammonium sulfate was added to 0.2 M (29.07 g) and the lysate stirred for 30
minutes on ice. Upon the addition of the ammonium sulfate, a precipitate formed
which was not removed prior to the PEI precipitation step, described below.
Ammonium sulfate prevents the Tth polymerase from binding to DNA in the crude
lysate and reduces ionic interactions of the DNA polymerase with other cell lysate
proteins. Speed in the initial steps of purification (i.e., up to loading onto and eluting
from the phenyl-sepharose column) and the presence of protease inhibitor (PMSF at
2.4 mM) are important for protection from proteolytic degradation of the DNA
polymerase. For best results, then, one proceeds directly to the Polymin P (purchased
from BDH) precipitation step to remove most nucleic acids rather than introducing a
centrifugation step to remove the precipitate that forms upon the addition of ammonium
sulfate. For the same reason, one can include in fraction II the soft, viscous pellet that
forms on top of the Polymin P/ammonium sulfate pellet, because the viscous pellet
does not contain nucleic acids. Agarose gel electrophoresis and ethidium bromide
staining of the Polymin P supernatant indicates that >90% of the macromolecular DNA
and RNA is removed by 0.2% Polymin P. To account for the additional amount of
protein, when the viscous pellet is included, the phenyl sepharose column should then
be ~10% larger than described below.

Empirical testing showed that 0.2% Polymin P (polyethyleneimine, PEI)
precipitates ≥90% of the total nucleic acid. Polymin P (pH 7.5) was added slowly to
0.2% (22 ml of 10% PEI) and the slurry stirred one hour on ice, then centrifuged at
30,000xg at 4°C for 45 minutes. A soft, viscous pellet formed on top of the PEI pellet,
requiring additional centrifugation after 920 ml of the supernatant was decanted. The
viscous material was centrifuged for one hour at 186,000xg at 2°C and yielded an
additional 40 ml of supernatant and very large gelatinous pellets. These pellets
contained <2% of the activity present in fraction I and 1.96 g of protein or 12.5% of
fraction I. The supernatants were pooled (fraction II, 960 ml) and contained 10.5 g
protein and 42.6 x 10^4 units of activity.
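The reagent quantities quoted for the ammonium sulfate and Polymin P additions follow from the 1,100 ml volume of fraction I. The sketch below reproduces that arithmetic; the function names are illustrative, the molecular weight of ammonium sulfate (132.14 g/mol) is a standard value not given in the text, and the small volume increase caused by the additions is ignored as a simplifying assumption.

```python
AMMONIUM_SULFATE_MW = 132.14  # g/mol for (NH4)2SO4; standard value, not from the text


def grams_for_molarity(volume_ml, molarity, mol_weight):
    """Mass of solute (g) needed to bring volume_ml to the given molarity."""
    return volume_ml / 1000.0 * molarity * mol_weight


def stock_volume_ml(volume_ml, final_percent, stock_percent):
    """Volume of stock (ml) giving final_percent, ignoring the added volume."""
    return volume_ml * final_percent / stock_percent


# Fraction I is 1,100 ml; targets are 0.2 M ammonium sulfate and 0.2% Polymin P.
print(round(grams_for_molarity(1100, 0.2, AMMONIUM_SULFATE_MW), 2))  # grams
print(round(stock_volume_ml(1100, 0.2, 10.0), 1))                    # ml of 10% PEI
```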

Fraction II was loaded onto a 3.2 x 6.5 cm (52 ml) phenyl sepharose CL-4B
(Lot MI 02547, purchased from Pharmacia-LKB) column (equilibrated in TE
containing 0.2 M ammonium sulfate and 0.5 mM DTT) at 80 ml/hr (10 ml/cm2/hr). All
resins were equilibrated and recycled according to the manufacturer's
recommendations. The column was washed with 240 ml of the same buffer (A280 to
baseline), then with 220 ml TE containing 0.5 mM DTT (no ammonium sulfate) to
remove non-Tth DNA polymerase proteins. The column was then washed with 270 ml
of 20% ethylene glycol in TE containing 0.5 mM DTT to remove more contaminating
protein, and the Tth polymerase activity was eluted with 2 M urea in TE containing
20% ethylene glycol and 0.5 mM DTT. The fractions (5 ml) containing the polymerase
activity were pooled (fraction IIIa, 84 ml). The routine activity assays of the
flow-through and wash fractions revealed that only ~50% of the applied polymerase activity
had bound when the capacity of the column was exceeded. To avoid exceeding the
capacity of the column, a larger column (with, for example, at least 2X as much phenyl
sepharose) should be used. The flow-through and wash fractions containing the
balance of the activity were pooled (fraction IIb, 685 ml), adjusted to 0.2 M ammonium
sulfate, and then reapplied to the same column after the column had been recycled and
reequilibrated.
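The column dimensions, bed volume, and area-normalized flow rate quoted for the phenyl sepharose step are related by simple cylinder geometry. The following sketch shows the relationship; the function names are illustrative, and the column is assumed to be a right circular cylinder described as diameter x height in centimeters.

```python
import math


def bed_volume_ml(diameter_cm, height_cm):
    """Bed volume (ml) of a cylindrical column; 1 cm^3 equals 1 ml."""
    return math.pi * (diameter_cm / 2.0) ** 2 * height_cm


def flow_per_area(flow_ml_per_hr, diameter_cm):
    """Flow rate normalized to column cross-section (ml/cm2/hr)."""
    return flow_ml_per_hr / (math.pi * (diameter_cm / 2.0) ** 2)


# 3.2 x 6.5 cm phenyl sepharose column run at 80 ml/hr.
print(round(bed_volume_ml(3.2, 6.5)), round(flow_per_area(80, 3.2)))
```

The same functions can be used to size the larger column recommended above, for example by doubling the bed height at constant diameter.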

Assays of low levels of Tth DNA polymerase activity in fractions containing
Polymin P (e.g., phenyl sepharose flow-through fractions) should be conducted in the
presence and absence of 10 mM EDTA. The presence of EDTA permits correction for
elevated background levels of radioactivity due to Polymin P binding of the nucleotide
triphosphate substrate.

As noted above, the Tth polymerase activity was eluted with a 2 M urea step
(fraction IIIa). The eluant was dialyzed into heparin sepharose loading buffer to avoid
prolonged exposure to urea (to avoid carbamylation) while waiting for the unretained
fraction IIb to be rerun over the same phenyl sepharose column. The dialyzed fraction
IIIa contained 42% of the applied activity (179,213 units) and about 3.5% of the
applied protein (351 mg), yielding a 12-fold purification. The pooled flow-through and
0.2 M ammonium sulfate wash fractions containing the unbound Tth DNA polymerase
(fraction IIb) consisted of 42.6% of the applied activity (181,559 units) and 40.8% of
the applied protein (4,110 mg). The column was recycled as recommended by the
manufacturer, reequilibrated with the starting buffer, and fraction IIb was reapplied.
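The yield and fold-purification figures quoted for each step are derived from units of activity and milligrams of protein. A minimal sketch of that bookkeeping follows, using the values reported for fraction II (42.6 x 10^4 units, 10.5 g protein) and the dialyzed first urea eluate (179,213 units, 351 mg). The function names are illustrative, and fold purification is assumed to mean the ratio of specific activities.

```python
def specific_activity(units, protein_mg):
    """Specific activity in units per mg protein."""
    return units / protein_mg


def percent_yield(units_out, units_in):
    """Activity recovered as a percentage of activity applied."""
    return 100.0 * units_out / units_in


def fold_purification(units_out, mg_out, units_in, mg_in):
    """Ratio of specific activity after a step to that before it."""
    return specific_activity(units_out, mg_out) / specific_activity(units_in, mg_in)


applied_units, applied_mg = 426_000, 10_500  # fraction II
eluted_units, eluted_mg = 179_213, 351       # dialyzed first urea eluate

print(round(percent_yield(eluted_units, applied_units)))
print(round(fold_purification(eluted_units, eluted_mg, applied_units, applied_mg), 1))
```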

Fraction IIb was loaded onto the phenyl sepharose column at 78 ml/hr. The
column was washed with 270 ml of 0.2 M ammonium sulfate in TE containing 0.5 mM
DTT, then with 170 ml TE containing 0.5 mM DTT (no ammonium sulfate), and finally
with 260 ml of 20% ethylene glycol in TE containing 0.5 mM DTT. The Tth
polymerase activity was again eluted with 2 M urea in TE containing 20% ethylene
glycol and 0.5 mM DTT. The fractions (4.3 ml) containing the polymerase activity
were pooled (fraction IIIb). The 2 M urea eluate (fraction IIIb) contained 87.6% of the
applied activity (159,132 units) and 8.8% of the applied protein (363 mg), yielding a
9.7-fold purification.

Fraction IIIb (116.4 ml) was adjusted to 0.15 M KCl and pooled with fraction
IIIa, which had been dialyzed without loss of activity into a buffer composed of 50 mM
Tris-Cl, pH 7.5, 0.1 mM EDTA, 0.2% Tween 20, 0.5 mM DTT, and 0.15 M KCl and
stored at 4°C. The pooled fraction III (243 ml) contained substantial levels of
contaminating specific and non-specific Tth endonucleases and exonucleases. The
combined fraction III contained 326,009 units of activity and 705 mg protein.

Fraction III was loaded onto a 2.2 x 12 cm (45 ml) heparin sepharose CL-6B
(purchased from Pharmacia-LKB) column (equilibrated in 0.15 M KCl, 50 mM
Tris-Cl, pH 7.5, 0.1 mM EDTA, 0.2% Tween 20, and 0.5 mM DTT) at 45 ml/hr. All of
the applied activity was retained by the column. The column was washed with 175 ml
of the same buffer (A280 to baseline) and eluted with 670 ml of a linear 150-750 mM
KCl gradient in the same buffer. Fractions (5.25 ml) eluting between 0.31 and 0.355
M KCl were pooled (fraction IV, 149 ml). Similar to Taq DNA polymerase, which
elutes with a peak at 0.31 M KCl, Tth DNA polymerase elutes with a peak at 0.33 M
KCl contaminated with the coeluting TthHB8I endonuclease (an isoschizomer of TaqI
endonuclease [TCGA]).
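For a linear salt gradient such as the 150-750 mM KCl gradient above, the nominal salt concentration at any point is a linear function of the gradient volume delivered. The sketch below is an idealization under stated assumptions: the function names are illustrative, and column dead volume and mixing are ignored, so actual elution positions will lag the nominal values.

```python
def gradient_conc(volume_ml, start_m, end_m, total_ml):
    """Nominal salt concentration (M) after volume_ml of a linear gradient."""
    return start_m + (end_m - start_m) * volume_ml / total_ml


def volume_at(conc_m, start_m, end_m, total_ml):
    """Gradient volume (ml) at which the nominal concentration equals conc_m."""
    return (conc_m - start_m) / (end_m - start_m) * total_ml


# 670 ml linear gradient from 0.15 M to 0.75 M KCl; activity peak at 0.33 M.
print(round(volume_at(0.33, 0.15, 0.75, 670), 1))
```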

Fraction IV was concentrated ~10-fold on an Amicon YM30 membrane and
subsequently dialyzed against 25 mM Tris-Cl, pH 7.5, 0.1 mM EDTA, 0.2% Tween
20, 0.5 mM DTT, and 100 mM KCl. A precipitate formed during dialysis and was
removed by centrifugation (10 minutes at 12,000xg, 4°C) without loss of activity.
These steps, including the heparin sepharose column, yielded a 27-fold purification,
with 95% of the activity applied to the heparin-sepharose column being recovered.

Although Tth DNA polymerase shares 88% sequence identity (93% similarity)
with Taq DNA polymerase, the ~10% difference in the two proteins changes their
purification properties on phosphocellulose significantly. In contrast to Taq DNA
polymerase, which, when run in pH 7.5 Tris buffer, elutes at 0.2 M KCl from
phosphocellulose with its contaminating endonuclease eluting at ~0.6-0.8 M KCl, Tth
DNA polymerase and endonuclease cannot be easily separated on phosphocellulose.
Tth DNA polymerase elutes with a peak at ~0.45 M KCl and the Tth endonuclease peak
is at 0.58 M KCl. Affigel-blue (Biorad Laboratories), however, is a useful resin for
separating Tth endonuclease from Tth DNA polymerase. Affigel blue is a dye-ligand
resin used for affinity purification of enzymes with binding sites for nucleotides.

The supernatant from centrifugation of fraction IV (16.8 ml) was loaded onto a
1.6 x 10 cm (20 ml) affigel-blue column (equilibrated in 25 mM Tris-Cl, pH 7.5, 0.1
mM EDTA, 0.2% Tween 20, 0.5 mM DTT, and 100 mM KCl) at 20 ml/hr. All of the
applied Tth DNA polymerase activity bound to the resin. The column was washed with
30 ml of the same buffer (A280 to baseline) and eluted with a 300 ml linear 0.1-0.5 M
KCl gradient in the same buffer. Fractions (3.05 ml) eluting between 0.28 and 0.455
M KCl were assayed to ensure absence of contaminating double- and single-strand
endonuclease, indicated by absence of both lower molecular weight specific or
non-specific DNA fragments after one hour or eleven hours incubation at 60°C with 5-20
units of Tth polymerase activity using 600 ng of plasmid pLSG1 covalently-closed
circular DNA or 850 ng of M13mp18 SS-DNA. When the KCl gradient was applied,
the Tth polymerase eluted with a fairly broad peak at ~0.35 M KCl, while the
endonuclease seemed to elute at >0.5 M KCl. Washing the affigel-blue column with
0.15 M KCl and eluting with a linear 0.15-0.6 M KCl gradient may provide better
separation.

Based on the SDS-PAGE pattern, two pools were made: fraction Va from peak
fractions (61 ml) and fraction Vb, from flanking fractions (72.5 ml). Fraction Va
contained 22.2 x 10^4 units of activity and 5.5 mg of protein, and fraction Vb contained
5.2 x 10^4 units of activity and 3.5 mg of protein. Both pools were concentrated
separately by diafiltration on YM30 membranes. Fraction Vb was concentrated
~10-fold on an Amicon YM30 membrane, then dialyzed into CM-Trisacryl buffer (25 mM
sodium acetate buffer, pH 5.0, 0.5 mM DTT, 0.1 mM EDTA, and 0.2% Tween 20)
containing 50 mM NaCl. Again, a precipitate formed during dialysis and was removed
by centrifugation (12,000xg for 10 minutes at 4°C) resulting in a minor (<2%) loss of
activity and a 1.4-fold purification. The resulting supernatant (8.6 ml, 5.1 x 10^4 units
of activity and 2.3 mg of protein) was loaded onto a 1 x 3.8 cm (3 ml) CM-Trisacryl
column (equilibrated in CM-Trisacryl buffer and 50 mM NaCl) at 3 ml/hr. All of the
applied activity was retained by the column. The column was washed with 17 ml of the
same buffer and eluted with 50 ml of a steep, linear 0.05-0.7 M NaCl gradient in the
same buffer. Fractions (1 ml) eluting between 0.175 and 0.25 M were analyzed by
SDS-PAGE prior to being pooled with fraction Va. The Tth DNA
polymerase activity eluted with a sharp peak at 0.21 M NaCl. Judged by SDS-PAGE of
the gradient fractions, the polymerase was significantly enriched but still contained
major contaminating bands at ~35 kDa, ~25 kDa, and ~18 kDa. The resulting fraction
V (11.4 ml), which contained fraction Va and the peak fractions from the CM-Trisacryl
column treatment of fraction Vb, was dialyzed into CM-Trisacryl buffer containing 50
mM NaCl. More precipitate formed and was removed by centrifugation (10 minutes at
12,000xg, 4°C) with insignificant loss of activity. The precipitate contained 0.91 mg
protein (~20%) and 2,227 units of activity (<1%).

The resulting supernatant (12.8 ml, containing 5.18 mg protein and 24.8 x 10^4
units of activity) was loaded onto a 1.6 x 6.0 cm (12 ml) CM-Trisacryl (purchased
from Pharmacia-LKB) column (equilibrated in CM-Trisacryl buffer containing 50 mM
NaCl) at 12 ml/hr. The column was washed with 20 ml of the same buffer containing
50 mM NaCl, then with 27 ml of the same buffer containing 100 mM NaCl. No
detectable polymerase activity appeared in the flow-through fractions. A technical
problem (column adaptor broke) led to the immediate elution (in 400 mM NaCl) of the
activity when the 100-400 mM NaCl linear gradient was applied. Seventy-eight percent
of the applied activity (19.4 x 10^4 units and 4.09 mg protein) was recovered and
reapplied to a CM-Trisacryl column of the same dimensions.

The loading fraction (35 ml) was 2.7-fold diluted after readjusting the solution
to 50 mM NaCl. The column was washed with 33 ml of the same buffer and eluted
with a 180 ml linear 50-400 mM NaCl gradient in the same buffer. Fractions (1.4 ml)
eluting between 0.16 and 0.2 M NaCl were separately concentrated/diafiltered on
Centricon 30 membranes in 2.5X storage buffer (50 mM Tris-Cl, pH 7.5, 250 mM
KCl, 0.25 mM EDTA, 2.5 mM DTT, and 0.5% Tween 20 [Pierce, Surfact-Amps]).
The Tth DNA polymerase activity eluted with a peak at 0.183 M NaCl, slightly earlier
than was observed in the trial column. In comparison, Taq DNA polymerase elutes at
0.19-0.205 M NaCl when run on CM-Trisacryl in the same pH 5.0 sodium acetate
buffer. The concentrated and diafiltered samples were diluted with 1.5 volumes of
80% glycerol (Fisher, spectral grade, autoclaved) and stored at -20°C until completion
of the analysis of the individual fractions by SDS-PAGE. The fractions containing the
Tth polymerase were of similar purity (~85-90%), as determined by SDS-PAGE.
The major band migrates as a ~90 kDa protein in this gel system with
minor contaminating bands. The discrepancy between this observed molecular weight
(~90 kDa) and the calculated molecular weight (~94 kDa, from the gene sequence) may
simply be due to anomalous gel migration or to degradation during the purification
process. The staining patterns of the individual fractions were similar enough to allow
pooling of all of the fractions (fraction VI, 21.5 ml).
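Diluting a sample in 2.5X storage buffer with 1.5 volumes of 80% glycerol brings every buffer component to 1X. The sketch below works through that dilution; the function name and dictionary keys are illustrative, and volumes are assumed to be additive (which is only approximately true for glycerol solutions).

```python
def after_glycerol(components_2_5x, glycerol_percent=80.0, glycerol_volumes=1.5):
    """Final concentrations after adding glycerol_volumes of glycerol stock
    to one volume of sample, assuming additive volumes."""
    total = 1.0 + glycerol_volumes
    final = {name: conc / total for name, conc in components_2_5x.items()}
    final["glycerol (% v/v)"] = glycerol_percent * glycerol_volumes / total
    return final


storage_2_5x = {
    "Tris-Cl (mM)": 50.0,
    "KCl (mM)": 250.0,
    "EDTA (mM)": 0.25,
    "DTT (mM)": 2.5,
    "Tween 20 (%)": 0.5,
}
for name, conc in after_glycerol(storage_2_5x).items():
    print(name, round(conc, 3))
```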
Fraction VI was further concentrated/diafiltered on an Amicon YM30 membrane
in 2.5X storage buffer. When the volume measured 7 ml, 0.2 ml was removed for
amino acid composition and sequence analysis. The remaining 6.8 ml was
concentrated to 1.6 ml and diluted with 2.4 ml of 80% glycerol. The resulting final
pool (4 ml) contained 2.17 mg protein and 162,789 units of activity (34.8% yield) with
a specific activity of 75,018 units/mg protein. The results of each step of the
purification are presented in tabular form below.

Example 2
Cloning the Thermus thermophilus Tth DNA Polymerase I Gene

This Example describes the strategy and methodology for cloning the Tth DNA
polymerase I (Tth Pol I) gene of Thermus thermophilus. PCR-amplified fragments of
the T. aquaticus DNA polymerase I (Taq Pol I) gene were used to probe genomic DNA
blots to determine the restriction sites present in the Tth Pol I gene and flanking
regions. PCR amplification of the Tth Pol I gene with Taq Pol I-specific primers
provided even more restriction site and DNA sequence information about the Tth Pol I
gene. This information provided the basis for a two-step cloning procedure to isolate
the Tth Pol I gene into plasmid pBS13+ (marketed by Stratagene; the plasmid is also
known as BSM13+).

A. Preparation of Probes 

Four labeled probes were generated by PCR in the presence of biotinylated
dUTP (biotin-11-dUTP, purchased from Bethesda Research Laboratories) and
Thermus aquaticus DNA to probe Southern blots of T. thermophilus genomic DNA.
Probe A was generated with primers CM07 and EK194 and encompasses 438 bp of the
5' end of the Taq Pol I gene from nucleotide -230 to +207. Probe B was generated
with primers MK138 and MK124 and encompasses 355 bp that span the HindIII site of
the Taq Pol I gene and extend from nucleotide +555 to +879. Probe C was generated
with primers MK143 and MK131 and encompasses 579 bp of the template-primer
binding site coding sequence and the BamHI site of the Taq Pol I gene from nucleotide
+1313 to +1891. Probe D was generated with primers MK130 and MK151 and
encompasses 473 bp of the 3' end of the Taq Pol I gene from nucleotide +2108 to
+3384.

The sequences of the primers used to prepare the probes are shown below:

CM07    5'-GCGTGGCGGCGGAGGCGTTG
EK194   5'-CTTGGCGTCAAAGACCACGATC
MK124   5'-GGCCTTGGGGCTTTCCAGA
MK130   5'-TGCGGGCCTGGATTGAGAAG
MK131   5'-CCCGGATCAGGTTCTCGTC
MK138   5'-GACCGGGGACGAGTCCGAC
MK143   5'-CCGCTGTCCTGGCCCACATG
MK151   5'-TTCGGCCCACCATGCCTGGT
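Basic properties of the listed primers, such as length, GC content, and the reverse complement (the strand a reverse primer anneals against), can be tabulated with a few lines of code. The sketch below is illustrative only: the helper names are assumptions, and only the six primers whose sequences are unambiguous in the listing are included.

```python
# Primers from the listing whose sequences are unambiguous (written 5' to 3').
PRIMERS = {
    "CM07": "GCGTGGCGGCGGAGGCGTTG",
    "EK194": "CTTGGCGTCAAAGACCACGATC",
    "MK130": "TGCGGGCCTGGATTGAGAAG",
    "MK138": "GACCGGGGACGAGTCCGAC",
    "MK143": "CCGCTGTCCTGGCCCACATG",
    "MK151": "TTCGGCCCACCATGCCTGGT",
}

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, and T."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```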
The sequence of the Taq Pol I gene is disclosed in Lawyer et al. and in U.S. Patent
application Serial No. 143,441, filed January 12, 1988, both incorporated herein by
reference.

The probes were individually prepared in 100 µl of total reaction mixture
composed of 10 mM Tris-HCl, pH 9.0 (the pH was set at nine to counteract the pH of
the biotinylated dUTP in the reaction mixture; the biotinylated dUTP is in a buffer of
100 mM Tris, pH 7.4), 50 mM KCl, 1.0 mM MgCl2, 100 µg/ml gelatin, 2 U of Taq
Pol I (marketed by Perkin-Elmer Cetus Instruments), 50 µM dATP, 50 µM dCTP, 50
µM dGTP, 37.5 µM TTP, 12.5 µM biotin-11-dUTP, 50 pmol each primer, and
template DNA. The template DNA consisted of 1 µl of a 1:100 dilution of PCR
products generated with the same primers in 25 cycles of a polymerase chain reaction in
a reaction mixture composed of 10 mM Tris-HCl, pH 8.3; 1.5 mM MgCl2; 200 µM
each dNTP; no biotinylated dUTP; and 1.0 ng Taq genomic DNA boiled for three
minutes and then quickly cooled on ice. PCR was performed in a Perkin-Elmer Cetus
Instruments Thermal Cycler. Probes and the template for probe generation were
generated using 15 cycles of a 1 minute 45 second ramp to 98°C, 15 seconds at 98°C
(in-tube temperature of 96.5°C), 45 second ramp to 55°C, 20 seconds at 55°C, 45
second ramp to 72°C, and 30 seconds at 72°C. There was a 5 minute soak at 72°C at
the end of the last cycle.
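The thermal cycler program described above can be laid out as a list of timed segments to estimate the total run time. This is a sketch only: the data layout and names are illustrative assumptions, and the ramp times are taken at face value from the text.

```python
# One cycle as (description, seconds), following the program in the text.
CYCLE = [
    ("ramp to 98 C", 105),   # 1 minute 45 seconds
    ("hold at 98 C", 15),
    ("ramp to 55 C", 45),
    ("hold at 55 C", 20),
    ("ramp to 72 C", 45),
    ("hold at 72 C", 30),
]
FINAL_SOAK_SECONDS = 5 * 60  # 5 minute soak at 72 C after the last cycle


def total_seconds(cycles):
    """Total programmed time for the given number of cycles plus the soak."""
    per_cycle = sum(seconds for _, seconds in CYCLE)
    return cycles * per_cycle + FINAL_SOAK_SECONDS


print(total_seconds(15) / 60.0, "minutes for 15 cycles")
print(total_seconds(25) / 60.0, "minutes for 25 cycles")
```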

The genomic DNA hybridized to the probes was isolated as described in
Lawyer et al., and Southern blots were performed as described by Maniatis, except that
MSI Magnagraph™ nylon membrane was used rather than nitrocellulose, and the DNA
was fixed to the membrane with UV light (in a UV Stratalinker™ 1800, marketed by
Stratagene) rather than heat.

Blots were prehybridized at 42°C for 2 hours in a solution composed of 5X
SSPE, 5X Denhardt's solution, 0.5% SDS, 5% dextran sulfate, 150 µg/ml carrier
DNA, and 50% formamide. Hybridization of probes to the blots was carried out
overnight at 42°C in the same solution with probe present at approximately 10 ng/ml.
After hybridization, the membranes were washed to remove unbound probe.

Each of the four probes A-D hybridized to Thermus thermophilus genomic
DNA. A restriction site map of the Tth Pol I gene region of the genome was
constructed by individually digesting Tth genomic DNA with restriction enzymes
PstI, BamHI, SacII, and Asp718 and probing Southern blots of the digested DNA. In
addition, double digestions with HindIII/Asp718; HindIII/BstEII; HindIII/NheI;
BamHI/Asp718; BamHI/BstEII; BamHI/SphI; and BamHI/NheI of Tth genomic DNA
followed by Southern blotting and probing of the digested DNA were performed. The
resulting information allowed the construction of a restriction site map used in the
cloning of the Tth Pol I gene.

B. PCR Amplification of the Primer-Template Binding Site Region of the Tth Pol
I Gene

A series of PCR amplifications was carried out using Tth genomic DNA as
template and primers homologous to Thermus aquaticus DNA in the region of the Taq
Pol I gene that encodes the primer-template binding site sequence of Taq Pol I. Several
primer pairs in various combinations were used in the amplifications, which were
targeted to amplify various regions of the Tth DNA Pol I gene corresponding to the
region from nucleotide 293 to 1891 of the Taq Pol I gene. One primer pair, MK143
and MK131, yielded product.

The amplification reactions were carried out in a buffer composed of 10 mM
Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl2, 200 µM each dNTP, 2 U Taq Pol I, 1
ng heat-denatured Tth genomic DNA, and 50 pmol of each primer. The amplifications
were carried out for 25 cycles using the same thermocycler programming described
above, and PCR products were analyzed via polyacrylamide gel electrophoresis.

Most of the primers used in the unsuccessful amplifications either had many
mismatches when later compared with the Tth Pol I gene sequence or had strategic
mismatches at the 3' end of the primers. Primer MK143 had 3 mismatches to the Tth
Pol I gene sequence, but those mismatches were located at the 5' end of the primer and
were followed by 15 bases of homology. Primer MK131 had 2 mismatches to the Tth
Pol I gene, but the mismatches were located in the middle of the primer.
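The observation about mismatch position can be illustrated with a short Python sketch. It is not part of the original procedure: the 18-base target site and the three primers are hypothetical sequences chosen only to mimic the cases described (mismatches at the 5' end, in the middle, and at the 3' end), and the function names are illustrative.

```python
def mismatch_positions(primer, target):
    """Return 0-based positions, counted from the primer's 5' end, where
    the primer differs from the target site (both written 5'->3' in the
    primer's sense)."""
    if len(primer) != len(target):
        raise ValueError("primer and target site must be the same length")
    return [i for i, (p, t) in enumerate(zip(primer.upper(), target.upper()))
            if p != t]


def three_prime_match_run(primer, target):
    """Count consecutive matching bases starting from the primer's 3' end."""
    if len(primer) != len(target):
        raise ValueError("primer and target site must be the same length")
    run = 0
    for p, t in zip(reversed(primer.upper()), reversed(target.upper())):
        if p != t:
            break
        run += 1
    return run


# Hypothetical target site; NOT a real Tth or Taq sequence.
TARGET = "GATTACAGATTACAGATT"

# Three mismatches at the 5' end followed by 15 matching bases
# (the situation described for MK143).
PRIMER_5P = "CTA" + TARGET[3:]
# Two mismatches in the middle of the primer (the situation described for MK131).
PRIMER_MID = TARGET[:8] + "CG" + TARGET[10:]
# A single mismatch at the 3' terminal base (the unfavorable case).
PRIMER_3P = TARGET[:-1] + "A"

for name, primer in [("5' mismatches", PRIMER_5P),
                     ("middle mismatches", PRIMER_MID),
                     ("3' mismatch", PRIMER_3P)]:
    print(name, mismatch_positions(primer, TARGET),
          three_prime_match_run(primer, TARGET))
```

A primer with an uninterrupted run of matching bases at its 3' end can still be extended by the polymerase, which is consistent with MK143 and MK131 yielding product despite their mismatches.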

The product of the MK143/MK131 amplification of Tth genomic DNA migrated
on a polyacrylamide gel identically with the MK143/MK131 amplification product
using Taq genomic DNA as template. Restriction mapping of these Taq and Tth
amplification products showed identical BamHI, SacI, and XhoI restriction sites but
different SacII and PstI restriction sites. The Tth PCR product generated with primers
MK143 and MK131 was further amplified via asymmetric PCR with the same primers
and subjected to DNA sequence analysis in accordance with the methods described in
Gyllensten and Erlich, 1988, Proc. Natl. Acad. Sci. USA 85(20):7652-7656; and Innis
et al., 1988, Proc. Natl. Acad. Sci. USA 85:9436-9440.

C. Cloning the 5' End of the Tth Pol I Gene

From the restriction site map and sequence information generated by the
Southern blot and PCR analyses, a strategy for cloning the Tth Pol I gene in two steps
was developed. An ~3 kb HindIII fragment of Tth genomic DNA hybridized with
probes A, B, and C but not D, indicating that the fragment contains the 5' end of the
Tth Pol I gene. This ~3 kb HindIII fragment also contained a BamHI restriction site,
which proved useful in cloning the 5' end of the gene.

To clone the 5' end of the Tth Pol I gene, a HindIII digest of Tth genomic DNA
was size fractionated by electroelution on a 0.5 inch tube gel by collecting 250 µl
fractions every 5 minutes during electrophoresis as fragments of about 3 kb in size
were eluting from the gel. Dot blots with the probes described above identified the
fractions containing the restriction fragments of interest. The fractionated DNA of
interest was then digested with restriction enzyme BamHI and treated with calf-intestine
alkaline phosphatase (CIAP). CIAP was purchased from Boehringer Mannheim and
used as directed by the manufacturer. Restriction enzymes, E. coli DNA polymerase,
and ligase enzymes used in these Examples can be purchased from manufacturers such
as New England Biolabs, Boehringer Mannheim (Asp718), and Promega (Csp45I, an
isoschizomer of AsuII) and used as directed by the manufacturer.

Plasmid pBS13+ (purchased from Stratagene) was likewise digested with
restriction enzymes HindIII and BamHI and then ligated with the BamHI digested,
CIAP-treated ~3 kb HindIII fragment pool. The ligation mixture was used to transform
E. coli K12 strain DG98 (thi-1, endA1, hsdR17, lacIQ, lacZΔM15, proC::Tn10,
supE44/F', lacIQ, lacZΔM15, proC+, available from the ATCC under accession
number 39,768) in substantial accord with the procedure of Hanahan. The
ampicillin resistant (AmpR) transformants were screened by failure to exhibit blue color
on X-gal plates and by probe hybridization with the DNA of transformed cells (via
replica plating and lysis of the replicated cells as described by Woods et al., 1982,
Proc. Natl. Acad. Sci. USA 79:5661) with 32P-labeled (by kinase treatment with γ-32P-
ATP) primer MK143. One colony contained a plasmid, designated pBSM:Tth5', in
which the ~2.5 kb HindIII-BamHI restriction fragment had ligated with the large
HindIII-BamHI restriction fragment of plasmid pBS13+.

D. Cloning the 3' End of the Tth Pol I Gene

The 3' end of the Tth Pol I gene was inserted into plasmid pBSM:Tth5' to yield
a vector, designated pBSM:Tth, that contains the intact coding sequence of the Tth Pol I
gene. The Southern blot and DNA sequence information showed that an ~12 kb
BamHI fragment of Tth genomic DNA could be digested with Asp718 to yield an ~5.6
kb fragment that hybridized with Probe D (the fragment should also hybridize with
Probe C). The information also showed that the BamHI site used to create the ~5.6 kb
BamHI-Asp718 restriction fragment was the same BamHI site used to create the ~2.5
kb HindIII-BamHI restriction fragment in plasmid pBSM:Tth5'.

Tth genomic DNA was then digested to completion with restriction enzyme
BamHI and size-fractionated as described above, except that fractions containing
fragments of ~12 kb in size were identified and collected. Fractions which hybridized
in a dot blot to biotinylated Probes D and C were pooled, digested with restriction
enzyme Asp718, treated with CIAP, and ligated with BamHI-Asp718 digested plasmid
pBSM:Tth5'. The ligated DNA was transformed into E. coli K12 strain DG101 (thi-1,
endA1, hsdR17, lacIQ, lacZΔM15, proC::Tn10).

The AmpR transformants were screened as above with 32P-labeled primer
MK132 to identify several colonies that contained a plasmid, designated pBSM:Tth,
that contained the ~5.6 kb BamHI-Asp718 and ~2.5 kb HindIII-BamHI fragments in
the correct orientation to reconstruct an intact coding sequence of the Tth Pol I gene.
The sequence of oligonucleotide MK132 perfectly matches the Tth Pol I gene sequence.
Several colonies with plasmid DNA that hybridized to the probe and yielded the
expected fragments on restriction enzyme digestion were induced with IPTG, and
Western blot analysis of protein samples from induced and uninduced colonies with
Taq Pol I polyclonal antibody showed an IPTG inducible band the same size (~94 kDa)
as Taq Pol I. One such colony was deposited with the ATCC and can be obtained from
the ATCC under accession number ATCC 68195. When culturing the strain, one must
maintain selective pressure (ampicillin) to prevent loss of plasmid DNA. ATCC 68195
can thus also be used to obtain untransformed DG101 cells.

Example 3
Construction of Plasmid pLSG21
The deletion of 3' noncoding ("downstream") sequences has been shown to
enhance recombinant expression of Thermus DNA polymerase in E. coli. In
pBSM:Tth, double digestion with restriction enzymes BstEII and KpnI followed by
Klenow repair in the presence of all four dNTPs and ligation under dilute conditions to
favor intramolecular ligation results in the deletion of 3' noncoding sequences of the
Tth DNA Pol I gene. Restriction enzyme BstEII cuts plasmid pBSM:Tth in the 3'
noncoding region of the Tth Pol I gene, and restriction enzyme KpnI cuts in the
polylinker region of the vector.

This deletion was made, and the resulting plasmid was designated as plasmid
pLSG21. The deletion protocol results in the regeneration of the BstEII restriction site.
However, plasmid pLSG21 does not drive increased levels of Tth Pol I expression
when compared to the levels achieved in plasmid pBSM:Tth-transformed E. coli host
cells.

Example 4

Construction of Plasmids pLSG22, pLSG23, and pLSG24
The Tth Pol I gene lacks convenient restriction sites at the 5' and 3' ends of the
gene. Such restriction sites facilitate the construction of a wide variety of expression
vectors. In addition, codons at the 5' end of the coding sequence are highly GC-rich,
which may inhibit efficient translation initiation and expression in E. coli. Site-directed
mutagenesis with oligonucleotides can be, and has been, used to introduce a number of
useful changes in the coding sequences and in the 5' and 3' noncoding regions of the
Tth Pol I gene.

Plasmid pBS13+ derivatives, such as plasmid pBSM:Tth, can be obtained in
single-stranded form by the protocols described in Lawyer et al. and by Stratagene, the
commercial supplier of plasmid pBS13+. To make single-stranded plasmid pBS13+ or
a single-stranded derivative plasmid, a host cell transformed with the plasmid is
infected with a helper phage (such as R408) and cultured under conditions that allow
production of phage DNA. The phage DNA is then collected and comprises the desired
single-stranded DNA and a small amount of helper phage DNA. The desired DNA can
be purified to remove the helper phage DNA by separating the DNA based on size, i.e.,
by electroelution.

For the constructions described below, a plasmid, designated pBSMΔPvuII,
proved useful. Plasmid pBSMΔPvuII was generated by deletion of the 382 bp PvuII
fragment of plasmid pBS13+. The site-specific mutagenesis protocols involved the
following steps: (1) single-stranded plasmid pBSM:Tth (or other pBS13+ single-
stranded derivative) and double-stranded, PvuII digested plasmid pBSMΔPvuII were
annealed by boiling a 1 to 2.5 molar ratio of pBSM:Tth (or other plasmid pBS13+
derivative)/pBSMΔPvuII for three minutes in Klenow salts and then incubating the
resulting mixture at 65°C for 5 minutes; (2) kinased mutagenizing oligonucleotide was
then annealed to the resulting gapped duplex at a molar ratio of 5 to 1 by heating the
oligonucleotide to 95°C for 1 minute and then adding the oligonucleotide to the gapped
duplex mixture held at 75°C; (3) the resulting mixture was incubated at 75°C for 2
minutes and then slowly cooled to room temperature; (4) this annealed mixture was
then extended with Klenow enzyme in the presence of all four dNTPs (200 µM in each
dNTP) for 15 minutes at 37°C with the addition of ligase and 40 µM ATP to the
reaction. The resulting mixture was used to transform E. coli K12 DG101.
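The molar ratios in steps (1) and (2) can be converted to DNA masses with a little arithmetic, sketched below in Python. This is an illustration only: the lengths of the template, vector, and oligonucleotide are hypothetical placeholders, the conversion uses the common approximations of about 330 g/mol per nucleotide of single-stranded DNA and about 660 g/mol per base pair of double-stranded DNA, and the sketch assumes the 5 to 1 oligonucleotide ratio is taken relative to the single-stranded template.

```python
# Approximate average masses (g/mol); rules of thumb, not exact values.
SS_G_PER_MOL_PER_NT = 330.0   # single-stranded DNA, per nucleotide
DS_G_PER_MOL_PER_BP = 660.0   # double-stranded DNA, per base pair


def pmol_to_ng(pmol, length, g_per_mol_per_unit):
    """Convert picomoles of a DNA molecule of the given length to nanograms.

    pmol * (g/mol) gives picograms; dividing by 1000 gives nanograms.
    """
    return pmol * length * g_per_mol_per_unit / 1000.0


def annealing_masses(template_pmol, template_nt, vector_bp, oligo_nt,
                     vector_ratio=2.5, oligo_ratio=5.0):
    """Masses (ng) for a 1 : vector_ratio template/vector mix and an
    oligo_ratio : 1 oligonucleotide/template mix (assumed interpretation)."""
    return {
        "template_ng": pmol_to_ng(template_pmol, template_nt,
                                  SS_G_PER_MOL_PER_NT),
        "gapped_vector_ng": pmol_to_ng(template_pmol * vector_ratio,
                                       vector_bp, DS_G_PER_MOL_PER_BP),
        "oligo_ng": pmol_to_ng(template_pmol * oligo_ratio, oligo_nt,
                               SS_G_PER_MOL_PER_NT),
    }


# Hypothetical sizes: 6000 nt single-stranded template, 2800 bp PvuII-cut
# vector, 40 nt mutagenizing oligonucleotide, 0.1 pmol of template.
masses = annealing_masses(0.1, 6000, 2800, 40)
for name, ng in masses.items():
    print(f"{name}: {ng:.1f} ng")
```

Because the vector is double-stranded and the template single-stranded, a 1 to 2.5 molar ratio is not a 1 to 2.5 mass ratio; the conversion makes that difference explicit.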

The AmpR transformants were screened by probing with the appropriate
screening primer. Colonies that had plasmid DNA that hybridized to the probe were
expanded into 3 ml cultures in R66 media (0.6% beef extract, 0.6% yeast extract, 2%
peptone, 0.5% NaCl, 40 mM KPO4, pH 7.2, 0.2% glucose, and 100 µg/ml
ampicillin), incubated at 37°C for eight hours, and then used to prepare plasmid DNA
by the method of Birnboim and Doly. The resulting plasmid DNA was subjected to
restriction enzyme and DNA sequence analysis to ensure that the desired plasmid was
obtained.

A. Construction of Plasmid pLSG22

EcoRV and BglII restriction enzyme sites were introduced downstream of the
TGA stop codon of the Tth Pol I gene coding sequence by the foregoing method using
oligonucleotide DG122 to mutagenize plasmid pBSM:Tth and oligonucleotide DG123
to identify the desired transformants by probe hybridization. These oligonucleotides
are shown below:

BglII EcoRV

DG122 5 f rrTrTAAAr^HAnATflTGATATCAACCCTTGGCGGAAAGC 3 ? 

DG123 5' CAGATCTGATATCAACCC 3'

The resulting plasmid was designated pLSG22.

B. Construction of Plasmid pLSG23

Plasmid pLSG22 was mutagenized to introduce BstXI and As^I (Csp45I) 
restriction sites at the ATG start codon of the coding sequence of the Tth Pol I gene. In 
addition, codons 2, 3, and 5-7 were altered to be more AT-rich without changing the 
amino acid sequence of the resulting protein. The mutagenizing oligonucleotide was
DG189, depicted below:

£££XI 

DG1 8 9 5 ' rHnnnnTTTGGG TTCGAA TAATGGTAACATAGCTCCC ATTAAT TTGGGCCACCTGTCCCCG 
3 T 

Tth TTCAAAGAGCGGAAGCATCGCCTCCAT

Codon 9 8 7 6 5 4 3 2 1 

The resulting plasmid was designated pLSG23. Transformants harboring plasmid
pLSG23 were identified by their AmpR phenotype and by hybridization with
oligonucleotide DG118, which has the structure shown below:
DG118 5' TGGTAACATAGCTTCCAT 3'
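The claim that the codon changes are silent can be checked for the first six codons with a short Python sketch. It is offered as an illustration under stated assumptions: the native sequence is the first 18 nucleotides of the Tth Pol I coding sequence as given in the claims, the mutated sequence is taken to be the reverse complement of screening oligonucleotide DG118 (assuming DG118 matches the mutated region exactly), and the standard genetic code is used. Codon 7 lies outside the region covered by DG118 and is not checked here.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def codons(seq):
    """Split a coding sequence into complete codons."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def translate(seq):
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def gc_count(seq):
    """Count G and C bases."""
    return sum(1 for base in seq.upper() if base in "GC")


NATIVE = "ATGGAGGCGATGCTTCCG"          # codons 1-6 of the native gene
DG118 = "TGGTAACATAGCTTCCAT"           # screening oligonucleotide, 5'->3'
MUTANT = reverse_complement(DG118)     # coding strand of the mutated region

changed = [i + 1 for i, (a, b) in enumerate(zip(codons(NATIVE), codons(MUTANT)))
           if a != b]

print("native :", codons(NATIVE), translate(NATIVE), "GC =", gc_count(NATIVE))
print("mutant :", codons(MUTANT), translate(MUTANT), "GC =", gc_count(MUTANT))
print("changed codons:", changed)
```

Under these assumptions the changed codons within the probe region are 2, 3, 5, and 6, the encoded amino acids are unchanged, and the G+C count of the region drops.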

C. Construction of Plasmid pLSG24

Plasmid pLSG22 was mutagenized to introduce BstXI and NdeI restriction sites
at the ATG start codon of the coding sequence of the Tth Pol I gene. In addition,
codons 2, 3, and 5-7 were altered to be more AT-rich without changing the amino acid
sequence of the encoded protein. The mutagenizing oligonucleotide was DG190,
depicted below.

                                            BstXI
                     AsuII
DG190 5' CCGGCCTTTGGGTTCGAATAATGGTAACATAGCTTCCATATGTTTGGGCCACCTGTCCCCG 3'

Tth                  TTCAAAGAGCGGAAGCATCGCCTCCAT
Codon                 9  8  7  6  5  4  3  2  1

The resulting plasmid was designated pLSG24. Transformants harboring plasmid 
pLSG24 were identified by their AmpR phenotype and by hybridization with 
oligonucleotide DG118. 

Example 5

Construction of Plasmids pLSG27 and pLSG28

A. Construction of Plasmid pBSM:TthΔStuI/HindIII

Plasmids pLSG27 and pLSG28 are Tth Pol I expression vectors that drive
expression of a truncated form of Tth Pol I. The truncation is an ~80 codon deletion
from the amino-terminal-encoding region of the coding sequence for Tth Pol I. To
construct these vectors, plasmid pBSM:Tth5' was first digested to completion with
restriction enzymes StuI and HindIII. The digested plasmid DNA was then treated with
Klenow enzyme in the presence of all four dNTPs and recircularized by ligation. This
treatment deleted the 5' noncoding region through codon 78 (the StuI site spans codons
77-79) of the Tth Pol I gene. Plasmid pBSM:Tth5' also lacks the 3' end of the Tth Pol
I coding sequence. The resulting plasmid was designated pBSM:TthΔStuI/HindIII.

B. Construction of Plasmid pLSG25

Plasmid pBSM:TthΔStuI/HindIII was mutagenized with oligonucleotide
DG191 as described above to yield plasmid pLSG25. In plasmid pLSG25, the
truncated Tth Pol I coding sequence is placed in position for expression from the lac
promoter. In addition, the lacZα coding sequence is deleted, and an AseI restriction
enzyme recognition site is placed at the ATG start of the truncated coding sequence.
The DG191 mutagenizing linker has the following structure:

DG191 5'-CCTCCCCGCCTTGTAGGCCATTAATT^

Transformants harboring plasmid pLSG25 were identified by their AmpR phenotype
and by hybridization with oligonucleotide DG193, which has the following structure:
DG193 5'-TTTGGTCTCCTGTGTG-3'

C. Construction of Plasmid pLSG26

Plasmid pLSG26 was constructed in the same manner as plasmid pLSG25,
except that the mutagenizing linker was DG192 as opposed to DG191. DG192 has the
following structure:

DG192 5'-CCTCCCCGCCTTGTAGGCCATATGTTTGGTCTCCTC

Plasmid pLSG26 is identical to plasmid pLSG25, except that an NdeI, as opposed to
an AseI, restriction enzyme recognition site spans the ATG start codon of the truncated
coding sequence. Transformants harboring plasmid pLSG26 were identified by their
AmpR phenotype and by hybridization with oligonucleotide DG193.

D. Final Construction of Plasmids pLSG27 and pLSG28

As noted above, plasmid pBSM:Tth5' lacks the 3' end of the Tth Pol I coding
sequence, so plasmids pLSG25 and pLSG26 also lack this sequence. To place this 3'
end of the Tth Pol I coding sequence in plasmids pLSG25 and pLSG26 in the correct
reading frame, each plasmid was digested to completion with restriction enzymes
BamHI and EcoRI. The large EcoRI-BamHI fragment of plasmid pLSG25 was then
ligated with the ~1.2 kb BamHI-EcoRI restriction fragment of plasmid pLSG22 to yield
plasmid pLSG27. The ~1.2 kb BamHI-EcoRI restriction fragment of plasmid pLSG22
contains the 3' end of the Tth Pol I coding sequence. In a similar fashion, plasmid
pLSG26 was digested with restriction enzymes BamHI and EcoRI and ligated with the
~1.2 kb BamHI-EcoRI restriction fragment of plasmid pLSG22 to yield plasmid
pLSG28. Both plasmids pLSG27 and pLSG28 drive low level expression in E. coli of
a truncated form of Tth Pol I with polymerase activity.

Example 6

Construction of Plasmids pLSG29 Through pLSG34
Although the lac promoter in plasmids pBSM:Tth, pLSG21, pLSG22,
pLSG23, pLSG24, pLSG27, and pLSG28 drives expression of Tth Pol I activity in E.
coli, one of skill in the art recognizes that utilization of a stronger promoter than the lac
promoter might increase Tth Pol I expression levels. One well known, powerful
promoter is the PL promoter from phage λ. In addition, higher expression levels and
more efficient production can be achieved by altering the ribosome-binding site,
transcription termination sequences, and origin of replication (or associated elements)
of the Tth Pol I expression vector. This example illustrates how such changes can be
made by describing the construction of expression vectors in which the λPL promoter
and either the bacteriophage T7 gene 10 or λ gene N ribosome-binding site are
positioned for expression of Tth Pol I.

A. Construction of Expression Vectors pDG160 and pDG161

Plasmid pDG160 is a λPL cloning and expression vector that comprises the λPL
promoter and gene N ribosome-binding site (see U.S. Patent No. 4,711,845,
incorporated herein by reference), a restriction site polylinker positioned so that
sequences cloned into the polylinker can be expressed under the control of the λPL-
NRBS, and a transcription terminator from the Bacillus thuringiensis delta-toxin gene
(see U.S. Patent No. 4,666,848, incorporated herein by reference). Plasmid pDG160
also carries a mutated RNAII gene, which renders the plasmid temperature sensitive for
copy number (see U.S. Patent No. 4,631,257, incorporated herein by reference).

These elements act in concert to make plasmid pDG160 a very useful and
powerful expression vector. At 30-32°C, the copy number of the plasmid is low, and
in a host cell that carries a temperature-sensitive λ repressor gene, such as cI857, the
PL promoter does not function. At 37-41°C, however, the copy number of the plasmid
is 25-50-fold higher than at 30-32°C, and the cI857 repressor is inactivated, allowing
the PL promoter to function. Plasmid pDG160 also carries an ampicillin resistance
(AmpR) marker. Plasmid pDG161 is identical to plasmid pDG160, except the AmpR
marker is replaced with a TetR (tetracycline resistance) marker.

So, plasmids pDG160 and pDG161 comprise the AmpR or TetR marker, the
λPL promoter, the gene N ribosome-binding site, a polylinker, and the BT cry PRE (BT
positive retroregulatory element, U.S. Patent No. 4,666,848) in a ColE1 copts vector.
These plasmids were constructed from previously described plasmids and the duplex
synthetic oligonucleotide linkers DG31 and DG32. The DG31/32 duplex linker
encodes a 5' HindIII cohesive end followed by SacI, NcoI, KpnI/Asp718, XmaI/SmaI
recognition sites and a 3' BamHI cohesive end. This duplex linker is shown below.

                SacI NcoI     XmaI
DG31  5'-AGCTTATGAGCTCCATGGTACCCCGGG-3'
      3'-    ATACTCGAGGTACCATGGGGCCCCTAG-5'  DG32

This duplex linker and plasmid pFC54.t were used to construct plasmid pDG160.

Plasmid pFC54.t, a 5.96 kb plasmid described in U.S. Patent No. 4,666,848,
supra, and available in E. coli K12 strain DG95 carrying the prophage λN7N53cI857
SusP80 from the ATCC under accession number ATCC 39789, was digested with
restriction enzymes HindIII and BamHI, and the isolated vector fragment was ligated
with a 5-fold molar excess of nonphosphorylated and annealed DG31/32 duplex.
Following ligation, the DNA was digested with XbaI (to inactivate the vector pFC54.t
DNA fragment the linker replaces) and used to transform E. coli K12 strain DG116
(ATCC 53,606) to ampicillin resistance. Colonies were screened by restriction enzyme
digestion for loss of the des-ala-ser125 IL-2 mutein sequence and acquisition of the
DG31/32 polylinker sequence. The polylinker region in the plasmid, designated
pDG160, of one AmpR transformant was sequenced to verify that the desired
construction was obtained.

Plasmid pAW740CHB (available in E. coli strain K12 DG116 from the ATCC
under accession number ATCC 67,605), the source of a modified tetracycline
resistance gene in which the BamHI and HindIII restriction sites were eliminated, and
which contains the λPL promoter, gene N ribosome-binding site, and BT cry PRE in a
ColE1 copts vector, was digested to completion with restriction enzymes HindIII and
BamHI and the 4.19 kb vector fragment purified by agarose gel electrophoresis. The
purified vector DNA fragment was ligated with a 5-fold molar excess of
nonphosphorylated annealed DG31/32 duplex. E. coli K12 strain DG116 was
transformed with a portion of the DNA, and TetR colonies screened for presence of 4.2
kb plasmids. Several transformants were further screened by DNA restriction enzyme
digestion and by sequence analysis of the polylinker region by the Sanger method.
Several transformants contained a plasmid with the desired sequence, and the plasmid
was designated pDG161.

B. Construction of Expression Plasmids pDG164 Through pDG181

To facilitate construction of Tth expression vectors and to increase the efficiency
of translation initiation, plasmids pDG160 and pDG161 were altered to introduce
changes in the λPL promoter and ribosome-binding site (RBS) region. In these
alterations, plasmids pDG160 and pDG161 were digested with restriction enzymes
BspMII and SacI and then ligated with short, synthetic linkers to create plasmids in
which the small BspMII-SacI restriction fragment of plasmid pDG160 (or pDG161)
was replaced with the duplex linker.

The duplex linkers used in these constructions had different structures and
properties. Duplex DG106/DG107 encodes the bacteriophage T7 gene 10 RBS and an
NdeI restriction enzyme recognition site at the ATG start codon and has the structure:

DG106  5'-CCGGAAGAAGGAGATATACATATGAGCT-3'
DG107  3'-    TTCTTCCTCTATATGTATAC-5'

Duplex DG108/DG109 encodes a modified T7 gene 10 RBS and an AseI restriction
enzyme recognition site at the ATG start codon and has the structure:

DG108  5'-CCGGAAGAAGGAGAAAAATTAATGAGCT-3'
DG109  3'-    TTCTTCCTCTTTTTAATTAC-5'

Duplex DG110/DG111 encodes the λ NRBS and an NdeI restriction enzyme recognition
site at the ATG start codon and has the structure:

                       NdeI
DG110  5'-CCGGAGGAGAAAACATATGAGCT-3'
DG111  3'-    TCCTCTTTTGTATAC-5'

Duplex DG112/DG113 encodes the NRBS and an AseI restriction enzyme recognition
site at the ATG start codon and has the structure:

                      AseI
DG112  5'-CCGGAGGAGAAAATTAATGAGCT-3'
DG113  3'-    TCCTCTTTTAATTAC-5'
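The cohesive ends of these duplex linkers can be checked programmatically. The Python sketch below is an illustration rather than part of the original method: it takes the top strand written 5' to 3' and the bottom strand written 3' to 5' (as the duplexes are drawn), finds where the bottom strand pairs with the top strand, and reports the unpaired ends of the top strand. The function name is illustrative, and only the DG110/DG111 and DG112/DG113 sequences are used.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def duplex_overhangs(top_5to3, bottom_3to5):
    """Return (left, right): the unpaired bases of the top strand to the
    left and right of the region paired with the bottom strand.

    The bottom strand is given 3'->5', i.e. in the orientation in which it
    lies directly under the top strand, so a plain complement (without
    reversal) yields the top-strand bases it pairs with.
    """
    paired = bottom_3to5.upper().translate(COMPLEMENT)
    start = top_5to3.upper().find(paired)
    if start < 0:
        raise ValueError("strands are not complementary")
    return top_5to3[:start], top_5to3[start + len(paired):]


DG110 = "CCGGAGGAGAAAACATATGAGCT"   # top strand, 5'->3'
DG111 = "TCCTCTTTTGTATAC"           # bottom strand, 3'->5'
DG112 = "CCGGAGGAGAAAATTAATGAGCT"
DG113 = "TCCTCTTTTAATTAC"

for top, bottom, site_name, site in [(DG110, DG111, "NdeI", "CATATG"),
                                     (DG112, DG113, "AseI", "ATTAAT")]:
    left, right = duplex_overhangs(top, bottom)
    print(left, right, site_name, "present:", site in top)
```

Both duplexes leave a CCGG single-stranded extension at the left end and an AGCT extension at the right end of the top strand, which is what ligation into BspMII-SacI-digested vector requires.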

The duplexes and BspMII-SacI-digested plasmids pDG160 and pDG161 were
ligated as shown in tabular form below to yield plasmids pDG164 through pDG171.

BspMII-SacI                        Constructed
Digested Vector    Duplex          Plasmid

pDG160             DG106/DG107     pDG164
pDG160             DG108/DG109     pDG166
pDG160             DG110/DG111     pDG168
pDG160             DG112/DG113     pDG170
pDG161             DG106/DG107     pDG165
pDG161             DG108/DG109     pDG167
pDG161             DG110/DG111     pDG169
pDG161             DG112/DG113     pDG171

These vectors, together with plasmids pDG160 and pDG161, were also modified, prior
to inserting the Tth Pol I gene coding sequence, to yield plasmids pDG172 through
pDG181.

This modification resulted in the destruction of the Csp45I (AsuII) restriction
enzyme recognition site in plasmids pDG160, pDG161, and pDG164 through
pDG171. Many of the vectors of the invention comprise a Csp45I site at the 5' end of
the Tth Pol I coding sequence. These Csp45I-deleted vectors serve as convenient
vectors for cloning fragments generated with restriction enzyme Csp45I or AsuII. This
Csp45I site is located in the colicinIMM gene of the plasmids and was deleted by
digesting with restriction enzyme Csp45I, treating the Csp45I-digested DNA with
Klenow enzyme in the presence of all four dNTPs to obtain blunt-ended, double-
stranded DNA, and recircularizing the plasmid DNA by ligation. The resulting
plasmids, designated pDG172 through pDG181, are shown in tabular form below.

                     Designation After
Starting Plasmid     Csp45I Site Removal

pDG160               pDG172
pDG161               pDG173
pDG164               pDG174
pDG165               pDG175
pDG166               pDG176
pDG167               pDG177
pDG168               pDG178
pDG169               pDG179
pDG170               pDG180
pDG171               pDG181

Plasmids pDG172 through pDG181 were then used to place the Tth Pol I gene of the
present invention in frame for expression under the control of the λPL promoter.

C. Construction of Tth Pol I Expression Vectors pLSG29 Through pLSG36
The Tth Pol I gene can be cloned into expression vectors pDG172 through
pDG181 to create Tth Pol I expression vectors. Several illustrative constructions are
shown in tabular form below.

Starting     Source of Tth Pol I                             Tth Pol I
Plasmid      Coding Sequence                                 Expression Plasmid

pDG174       NdeI-BamHI Restriction Fragment of pLSG24       pLSG31
pDG174       NdeI-BamHI Restriction Fragment of pLSG28       pLSG35
pDG175       NdeI-BamHI Restriction Fragment of pLSG24       pLSG32
pDG177       AseI-BamHI Restriction Fragment of pLSG23       pLSG29
pDG178       NdeI-BamHI Restriction Fragment of pLSG24       pLSG33
pDG178       NdeI-BamHI Restriction Fragment of pLSG28       pLSG36
pDG179       NdeI-BamHI Restriction Fragment of pLSG24       pLSG34
pDG181       AseI-BamHI Restriction Fragment of pLSG23       pLSG30



Expression vectors pLSG29 through pLSG36 were transformed into E. coli K12 strain
DG116 and cultured under conditions that allow for expression of Tth Pol I. All
transformants yielded about the same amounts of activity, although vectors with the
NRBS may yield somewhat higher levels of activity than vectors with the T7RBS. The
λPL promoter vectors also produced Tth Pol I at levels at least an order of magnitude
higher than the lac promoter expression vectors.
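The construction history of each expression vector can be traced by chaining the three tables of this Example. The Python sketch below encodes those tables as dictionaries and looks up, for a given pLSG plasmid, its parent vectors, the duplex linker used, and the ribosome-binding site that linker encodes. The dictionary and function names are illustrative, and the sketch assumes that each derivative inherits the resistance marker of its pDG160 or pDG161 parent.

```python
# Duplex linker -> (ribosome-binding site, restriction site at the ATG).
DUPLEXES = {
    "DG106/DG107": ("T7 gene 10 RBS", "NdeI"),
    "DG108/DG109": ("modified T7 gene 10 RBS", "AseI"),
    "DG110/DG111": ("lambda gene N RBS", "NdeI"),
    "DG112/DG113": ("lambda gene N RBS", "AseI"),
}

# BspMII-SacI linker replacement: product -> (parent vector, duplex).
LINKER_STEP = {
    "pDG164": ("pDG160", "DG106/DG107"), "pDG165": ("pDG161", "DG106/DG107"),
    "pDG166": ("pDG160", "DG108/DG109"), "pDG167": ("pDG161", "DG108/DG109"),
    "pDG168": ("pDG160", "DG110/DG111"), "pDG169": ("pDG161", "DG110/DG111"),
    "pDG170": ("pDG160", "DG112/DG113"), "pDG171": ("pDG161", "DG112/DG113"),
}

# Csp45I site removal: product -> parent vector.
CSP45I_STEP = {"pDG172": "pDG160", "pDG173": "pDG161"}
CSP45I_STEP.update({f"pDG{n + 10}": f"pDG{n}" for n in range(164, 172)})

# Tth Pol I expression plasmid -> starting (Csp45I-deleted) vector.
EXPRESSION = {
    "pLSG29": "pDG177", "pLSG30": "pDG181", "pLSG31": "pDG174",
    "pLSG32": "pDG175", "pLSG33": "pDG178", "pLSG34": "pDG179",
    "pLSG35": "pDG174", "pLSG36": "pDG178",
}

# Marker of the two base vectors (assumed to be inherited by derivatives).
MARKER = {"pDG160": "AmpR", "pDG161": "TetR"}


def describe(expression_plasmid):
    """Trace an expression plasmid back to pDG160 or pDG161."""
    vector = EXPRESSION[expression_plasmid]
    linker_vector = CSP45I_STEP[vector]
    base, duplex = LINKER_STEP[linker_vector]
    rbs, start_site = DUPLEXES[duplex]
    return {
        "lineage": [expression_plasmid, vector, linker_vector, base],
        "duplex": duplex,
        "rbs": rbs,
        "start_site": start_site,
        "marker": MARKER[base],
    }


for name in sorted(EXPRESSION):
    info = describe(name)
    print(name, " <- ".join(info["lineage"][1:]), info["rbs"], info["marker"])
```

Grouping the vectors by the returned RBS separates the gene N ribosome-binding site constructs from the T7 gene 10 constructs, which is the comparison made in the preceding paragraph.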
Example 7

Synthesis of Recombinant Tth Pol I Activity in E. coli
E. coli K12 strain DG116 (ATCC 53,606) harboring Tth Pol I expression
plasmids with the λPL promoter was grown at 32°C in Bonner-Vogel minimal salts
media containing 0.5% glucose, 10 µg/ml thiamine, 0.25% (w/v) Difco casamino
acids, and ampicillin (100 µg/ml) or tetracycline (10 µg/ml) as appropriate. Cells were
grown to an A600 of about 0.8 and shifted to 37°C to derepress the λPL promoter
(inactivation of cI857 repressor) and increase the copy number of the ColE1 copts
plasmid vector. After six to nine hours of growth at 37°C, aliquots of the cells were
harvested, the cells centrifuged, and the pellets stored at -70°C.

Alternatively, E. coli K12 strain KB2 (ATCC 53,075) harboring a Tth
expression plasmid under the control of the trp promoter/operator can be grown for
eight hours at 32°C in Bonner-Vogel minimal salts media containing 0.5% glucose, 5
µg/ml tryptophan, 10 µg/ml thiamine, 0.25% Difco casamino acids, and 100 µg/ml
ampicillin or 10 µg/ml tetracycline to an A600 of 3.0. Cells were harvested as above.

Cell pellets were resuspended to 5 to 10 O.D. units/ml in 50 mM Tris-Cl, pH
7.5, 1 mM EDTA, 2.4 mM PMSF, and 0.5 µg/ml leupeptin and lysed by sonication.
Aliquots of the sonicated extracts were subjected to SDS-PAGE and analyzed by
Coomassie staining and Western immunoblotting with rabbit polyclonal anti-Taq
polymerase antibody. In addition, portions of the extracts were assayed in a high
temperature (74°C) DNA polymerase assay.

Western immunoblotting showed significant induction and synthesis of an
approximately 94 kDa Tth DNA polymerase polypeptide in induced strains harboring
Tth expression plasmids. Coomassie blue staining of SDS-PAGE-separated total cell
protein revealed the presence of a new predominant protein at ~94 kDa in these induced
strains. Finally, high temperature activity assays confirmed the significant level of
recombinant Tth DNA polymerase synthesis in these E. coli strains.

Example 8
PCR with Tth DNA Polymerase
About 1.25 units of the Tth DNA polymerase purified in Example 1 were used
to amplify rRNA encoding sequences from Tth genomic DNA. The reaction volume
was 50 µl, and the reaction mixture contained 50 pmol of primer DG73, 10^5 to 10^6
copies of the Tth genome (~2 x 10^5 copies of genome/ng DNA), 50 pmol of primer
DG74, 200 µM of each dNTP, 2 mM MgCl2, 10 mM Tris-HCl, pH 8.3, 50 mM KCl,
and 100 µg/ml gelatin (although gelatin can be omitted).

The reaction was carried out on a Perkin-Elmer Cetus Instruments DNA
Thermal Cycler. Twenty to 30 cycles of 96°C for 15 seconds, 50°C for 30 seconds,
and 75°C for 30 seconds were carried out. At 20 cycles, the amplification product
(160 bp in size) could be faintly seen on an ethidium bromide stained gel, and at 30
cycles, the product was readily visible (under UV light) on the ethidium bromide
stained gel.
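The quantities in this reaction can be restated with a few lines of arithmetic, sketched below in Python. The sketch assumes a 50 µl reaction volume (the volume used for the per-reaction enzyme figure in this Example) and counts only the programmed hold times, ignoring ramp times between temperatures; the function names are illustrative.

```python
COPIES_PER_NG = 2e5            # approximate Tth genome copies per ng of DNA
REACTION_VOLUME_UL = 50.0      # assumed reaction volume in microliters
STEP_SECONDS = (15, 30, 30)    # holds at 96, 50, and 75 degrees C


def template_mass_ng(copies):
    """Mass of genomic DNA (ng) that contains the given number of genome copies."""
    return copies / COPIES_PER_NG


def micromolar(pmol, volume_ul):
    """Concentration in micromolar; 1 pmol per microliter equals 1 micromolar."""
    return pmol / volume_ul


def hold_minutes(cycles, step_seconds=STEP_SECONDS):
    """Total programmed hold time in minutes, excluding ramps."""
    return cycles * sum(step_seconds) / 60.0


print("template: %.1f to %.1f ng" % (template_mass_ng(1e5), template_mass_ng(1e6)))
print("each primer: %.1f micromolar" % micromolar(50, REACTION_VOLUME_UL))
print("enzyme: %.3f U per microliter" % (1.25 / REACTION_VOLUME_UL))
print("hold time: %.1f to %.1f minutes" % (hold_minutes(20), hold_minutes(30)))
```

Under these assumptions the template corresponds to roughly 0.5 to 5 ng of genomic DNA and each primer is present at about 1 µM.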

The PCR may yield fewer non-specific products if fewer units (i.e., 0.31 U/50
µl reaction) of Tth are used. In addition, adding a non-ionic detergent, such as
laureth-12, to the reaction mixture to a final concentration of 1% can improve the yield
of PCR product.

Primers DG73 and DG74 are shown below:

DG73 5' TACGTTCCCGGGCCTTGTAC 3'
DG74 5' AGGAGGTGATCCAACCGCA 3'

In the Claims

1. A recombinant DNA sequence that encodes Thermus thermophilus DNA
polymerase I activity.

2. The DNA sequence of Claim 1 that can be isolated from plasmid
pBSM:Tth.

3. The DNA sequence of Claim 1 that encodes the amino acid sequence, from
amino to carboxy terminus:

MetGluAlaMetLeuProLeuPheGluProLysGlyArgValLeuLeuValAspGlyHis 
HisLeuAlaTyrArgThrPhePheAlaLeuLysGlyLeuThrThrSerArgGlyGluPro 
ValGlnAlaValTyrGlyPheAlaLysSerLeuLeuLysAlaLeuLysGluAspGlyTyr
LysAlaValPheValValPheAspAlaLysAlaProSerPheArgHisGluAlaTyrGlu 
AlaTyrLysAlaGlyArgAlaProThrProGluAspPheProArgGlnLeuAlaLeuIle 
LysGluLeuValAspLeuLeuGlyPheThrArgLeuGluValProGlyTyrGluAlaAsp 
AspValLeuAlaThrLeuAlaLysLysAlaGluLysGluGlyTyrGluValArgIleLeu
ThrAlaAspArgAspLeuTyrGlnLeuValSerAspArgValAlaValLeuHisProGlu 
GlyHisLeuIleThrProGluTrpLeuTrpGluLysTyrGlyLeuArgProGluGlnTrp
ValAspPheArgAlaLeuValGlyAspProSerAspAsnLeuProGlyValLysGlyIle
GlyGluLysThrAlaLeuLysLeuLeuLysGluTrpGlySerLeuGluAsnLeuLeuLys
AsnLeuAspArgValLysProGluAsnValArgGluLysIleLysAlaHisLeuGluAsp 
LeuArgLeuSerLeuGluLeuSerArgValArgThrAspLeuProLeuGluValAspLeu
AlaGlnGlyArgGluProAspArgGluGlyLeuArgAlaPheLeuGluArgLeuGluPhe
GlySerLeuLeuHisGluPheGlyLeuLeuGluAlaProAlaProLeuGluGluAlaPro
TrpProProProGluGlyAlaPheValGlyPheValLeuSerArgProGluProMetTrp 
AlaGluLeuLysAlaLeuAlaAlaCysArgAspGlyArgValHisArgAlaAlaAspPro
LeuAlaGlyLeuLysAspLeuLysGluValArgGlyLeuLeuAlaLysAspLeuAlaVal
LeuAlaSerArgGluGlyLeuAspLeuValProGlyAspAspProMetLeuLeuAlaTyr 
LeuLeuAspProSerAsnThrThrProGluGlyValAlaArgArgTyrGlyGlyGluTrp 
ThrGluAspAlaAlaHisArgAlaLeuLeuSerGluArgLeuHisArgAsnLeuLeuLys
ArgLeuGluGlyGluGluLysLeuLeuTrpLeuTyrHisGluValGluLysProLeuSer 
ArgValLeuAlaHisMetGluAlaThrGlyValArgLeuAspValAlaTyrLeuGlnAla 
LeuSerLeuGluLeuAlaGluGluIleArgArgLeuGluGluGluValPheArgLeuAla 
GlyHisProPheAsnLeuAsnSerArgAspGlnLeuGluArgValLeuPheAspGluLeu 
ArgLeuProAlaLeuGlyLysThrGlnLysThrGlyLysArgSerThrSerAlaAlaVal 
LeuGluAlaLeuArgGluAlaHisProIleValGluLysIleLeuGlnHisArgGluLeu 
ThrLysLeuLysAsnThrTyrValAspProLeuProSerLeuValHisProArgThrGly 
ArgLeuHisThrArgPheAsnGlnThrAlaThrAlaThrGlyArgLeuSerSerSerAsp 
ProAsnLeuGlnAsnIleProValArgThrProLeuGlyGlnArgIleArgArgAlaPhe
ValAlaGluAlaGlyTrpAlaLeuValAlaLeuAspTyrSerGlnIleGluLeuArgVal
LeuAlaHisLeuSerGlyAspGluAsnLeuIleArgValPheGlnGluGlyLysAspIle
HisThrGlnThrAlaSerTrpMetPheGlyValProProGluAlaValAspProLeuMet 
ArgArgAlaAlaLysThrValAsnPheGlyValLeuTyrGlyMetSerAlaHisArgLeu 
SerGlnGluLeuAlaIleProTyrGluGluAlaValAlaPheIleGluArgTyrPheGln
SerPheProLysValArgAlaTrpIleGluLysThrLeuGluGluGlyArgLysArgGly
TyrValGluThrLeuPheGlyArgArgArgTyrValProAspLeuAsnAlaArgValLys
SerValArgGluAlaAlaGluArgMetAlaPheAsnMetProValGlnGlyThrAlaAla 
AspLeuMetLysLeuAlaMetValLysLeuPheProArgLeuArgGluMetGlyAlaArg
MetLeuLeuGlnValHisAspGluLeuLeuLeuGluAlaProGlnAlaArgAlaGluGlu 
ValAlaAlaLeuAlaLysGluAlaMetGluLysAlaTyrProLeuAlaValProLeuGlu 
ValGluValGlyMetGlyGluAspTrpLeuSerAlaLysGly 
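
As a reading aid for a three-letter listing such as the one in Claim 3, the following minimal Python sketch splits a listing into residues and converts it to one-letter codes. The function name `three_to_one` and the lookup table are illustrative assumptions introduced here, not part of the specification, and only the first 20 residues of the sequence are used as input.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(listing: str) -> str:
    """Convert a run of three-letter residue codes to one-letter codes.

    Whitespace and line breaks are ignored. Hypothetical helper: it raises
    ValueError on any code it does not recognize rather than guessing.
    """
    compact = "".join(listing.split())
    if len(compact) % 3 != 0:
        raise ValueError("listing length is not a multiple of three")
    residues = [compact[i:i + 3] for i in range(0, len(compact), 3)]
    unknown = [r for r in residues if r not in THREE_TO_ONE]
    if unknown:
        raise ValueError(f"unrecognized residue codes: {unknown}")
    return "".join(THREE_TO_ONE[r] for r in residues)


# First 20 residues of the amino acid sequence recited in Claim 3.
FIRST_20 = "MetGluAlaMetLeuProLeuPheGluProLysGlyArgValLeuLeuValAspGlyHis"

if __name__ == "__main__":
    print(three_to_one(FIRST_20))
```

Because the helper rejects unrecognized codes, it can also serve as a quick check for transcription errors in a listing.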

4. The DNA sequence of Claim 3 that is

5'-ATGGAGGCGA TGCTTCCGCT CTTTGAACCC AAAGGCCGGG TCCTCCTGGT
GGACGGCCAC CACCTGGCCT ACCGCACCTT CTTCGCCCTG AAGGGCCTCA 
CCACGAGCCG GGGCGAACCG GTGCAGGCGG TCTACGGCTT CGCCAAGAGC 
CTCCTCAAGG CCCTGAAGGA GGACGGGTAC AAGGCCGTCT TCGTGGTCTT 
TGACGCCAAG GCCCCCTCCT TCCGCCACGA GGCCTACGAG GCCTACAAGG 
CGGGGAGGGC CCCGACCCCC GAGGACTTCC CCCGGCAGCT CGCCCTCATC 
AAGGAGCTGG TGGACCTCCT GGGGTTTACC CGCCTCGAGG TCCCCGGCTA 
CGAGGCGGAC GACGTTCTCG CCACGCTGGC CAAGAAGGCG GAAAAGGAGG 
GGTACGAGGT GCGCATCCTC ACCGCCGACC GCGACCTCTA CCAACTCGTC 
TCCGACCGCG TCGCCGTCCT CCACCCCGAG GGCCACCTCA TCACCCCGGA
GTGGCTTTGG GAGAAGTACG GCCTCAGGCC GGAGCAGTGG GTGGACTTCC 
GCGCCCTCGT GGGGGACCCC TCCGACAACC TCCCCGGGGT CAAGGGCATC 
GGGGAGAAGA CCGCCCTCAA GCTCCTCAAG GAGTGGGGAA GCCTGGAAAA 
CCTCCTCAAG AACCTGGACC GGGTAAAGCC AGAAAACGTC CGGGAGAAGA 
TCAAGGGCCA CCTGGAAGAC CTCAGGCTCT CCTTGGAGCT CTCCCGGGTG 
CGCACCGACC TCCCCCTGGA GGTGGACCTC GCCCAGGGGC GGGAGCCCGA 
CCGGGAGGGG CTTAGGGCCT TCCTGGAGAG GCTGGAGTTC GGCAGCCTCC 
TCCACGAGTT CGGCCTCCTG GAGGCCCCCG CCCCCCTGGA GGAGGCCCCC 
TGGCCCCCGC CGGAAGGGGC CTTCGTGGGC TTCGTCCTCT CGGGCCCCGA 
GCCCATGTGG GCGGAGCTTA AAGCCCTGGC CGCCTGCAGG GACGGCCGGG 
TGCACCGGGC AGCAGACCCC TTGGCGGGGC TAAAGGACCT CAAGGAGGTC
CGGGGCCTCC TCGGCAAGGA CCTCGCCGTC TTGGCCTCGA GGGAGGGGCT 
AGACCTCGTG CCCGGGGACG ACCCCATGCT CCTCGCGTAC CTCCTGGACG 
CGTCCAACAC CACCCCCGAG GGGGTGGCGC GGCGCTACGG GGGGGAGTGG 
ACGGAGGACG CCGCCCACCG GGCCCTCCTC TCGGAGAGGC TCCATCGGAA 
CGTCCTTAAG CGCCTCGAGG GGGAGGAGAA GCTCCTTTGG CTCTACCACG
AGGTGGAAAA GCCCCTCTCC CGGGTCCTGG CCCACATGGA GGCCACCGGG 
GTACGGCTGG ACGTGGCCTA CCTTCAGGCC CTTTCCCTGG AGCTTGCGGA 
GGAGATCCGC CGCCTCGAGG AGGAGGTCTT CCGCTTGGCG GGCCACCCCT
TCAACCTCAA CTCCCGGGAC CAGCTGGAAA GGGTGCTCTT TGACGAGCTT 
AGGCTTCCCG CCTTGGGGAA GACGCAAAAG ACAGGCAAGC GCTCCACCAG 
CGCCGCGGTG CTGGAGGCCC TACGGGAGGC CCACCCCATC GTGGAGAAGA 
TCCTCCAGCA CCGGGAGCTC ACCAAGCTCA AGAACACCTA CGTGGACCCC 
CTCCCAAGCC TCGTCCACCC GAGGACGGGC CGCCTCCACA CCCGCTTCAA 
CCAGACGGCC ACGGCCACGG GGAGGCTTAG TAGCTCCGAC CCCAACCTGC 
AGAACATCCC CGTCCGCACC CCCTTGGGCC AGAGGATCCG CCGGGCCTTC
GTGGCCGAGG CGGGTTGGGC GTTGGTGGCC CTGGACTATA GCCAGATAGA
GCTCCGCGTC CTCGCCCACC TCTCCGGGGA CGAAAACCTG ATCAGGGTCT 
TCCAGGAGGG GAAGGACATC CACACCCAGA CCGCAAGCTG GATGTTCGGC 
GTCCCCCCGG AGGCCGTGGA CCCCCTGATG CGCCGGGCGG CCAAGACGGT 
GAACTTCGGC GTCCTCTACG GCATGTCCGC CCATAGGCTC TCCCAGGAGC 
TTGCCATCCC CTACGAGGAG GCGGTGGCCT TTATAGAGCG CTACTTCCAA 
AGCTTCCCCA AGGTGCGGGC CTGGATAGAA AAGACCCTGG AGGAGGGGAG 
GAAGCGGGGC TACGTGGAAA CCCTCTTCGG AAGAAGGCGC TACGTGCCCG
ACCTCAACGC CCGGGTGAAG AGCGTCAGGG AGGCCGCGGA GCGCATGGCC 
TTCAACATGC CCGTCCAGGG CACCGCCGCC GACCTCATGA AGCTCGCCAT 
GGTGAAGCTC TTCCCCCGCC TCCGGGAGAT GGGGGCCCGC ATGCTCCTCC 
AGGTCCACGA CGAGCTCCTC CTGGAGGCCC CCCAAGCGCG GGCCGAGGAG 
GTGGCGGCTT TGGCCAAGGA GGCCATGGAG AAGGCCTATC CCCTCGCCGT 
GCCCCTGGAG GTGGAGGTGG GGATGGGGGA GGACTGGCTT TCCGCCAAGG 
GTTAG-3'
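
The correspondence between a nucleotide sequence such as that of Claim 4 and the amino acid sequence of Claim 3 can be checked by translating codons with the standard genetic code. The Python sketch below is illustrative only: the helper `translate` and the compact codon-table construction are assumptions introduced here, and only the first 60 nucleotides of the sequence are used as input.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (third base varying fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string (5' to 3') into one-letter codes.

    Spaces and line breaks are ignored; translation stops at the first
    stop codon. Hypothetical helper written for this illustration.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 60 nucleotides of the sequence recited in Claim 4.
FIRST_60 = (
    "ATGGAGGCGA TGCTTCCGCT CTTTGAACCC "
    "AAAGGCCGGG TCCTCCTGGT GGACGGCCAC"
)

if __name__ == "__main__":
    # Corresponds to MetGluAlaMetLeuProLeuPheGluProLysGlyArgValLeuLeuValAspGlyHis
    print(translate(FIRST_60))  # MEAMLPLFEPKGRVLLVDGH
```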

5. A recombinant DNA sequence that encodes a protein with thermostable
DNA polymerase activity, said protein comprising a sequence of amino acids that has 
100% homology to a contiguous sequence of at least five out of nine amino acids 
encoded by the Thermus thermophilus DNA polymerase encoding sequence of Claim 
3, said contiguous sequence of nine amino acids selected from the group consisting of 
codons 238-246, 241-249, 335-343, 336-344, 337-345, 338-346, 339-347. 

6. A recombinant DNA sequence that encodes a protein with thermostable
polymerase activity, said protein comprising a sequence of amino acids that has 100% 
homology to a contiguous sequence of at least four out of six amino acids encoded by 
the Thermus thermophilus DNA polymerase encoding sequence of Claim 3 at codons 
225-230. 
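
One possible reading of the "at least five out of nine" and "at least four out of six" language of Claims 5 and 6 is a substring test: a candidate protein qualifies if it contains, without mismatch, some contiguous stretch of the required minimum length taken from the named codon window. The Python sketch below illustrates that reading under stated assumptions: the function names are hypothetical, the reference is only the first 20 residues of the Claim 3 sequence in one-letter code, and the window 5-13 is chosen purely for illustration and is not one of the claimed windows.

```python
def window_kmers(reference: str, start: int, end: int, k: int) -> set[str]:
    """Return every contiguous k-residue stretch inside a codon window.

    `start` and `end` are 1-based, inclusive codon numbers. Checking only
    stretches of exactly k residues is sufficient, because any longer
    exact match necessarily contains a k-residue match.
    """
    window = reference[start - 1:end]
    return {window[i:i + k] for i in range(len(window) - k + 1)}


def matches_window(candidate: str, reference: str,
                   start: int, end: int, k: int) -> bool:
    """True if the candidate contains an exact k-residue stretch of the window."""
    return any(kmer in candidate for kmer in window_kmers(reference, start, end, k))


# First 20 residues of the Claim 3 sequence, one-letter code.
REFERENCE = "MEAMLPLFEPKGRVLLVDGH"

if __name__ == "__main__":
    # Illustrative nine-residue window (codons 5-13), minimum stretch of five.
    print(sorted(window_kmers(REFERENCE, 5, 13, 5)))
    print(matches_window("AAAFEPKGAAA", REFERENCE, 5, 13, 5))  # five contiguous residues match
    print(matches_window("AAAFEPKAAA", REFERENCE, 5, 13, 5))   # only four contiguous residues match
```

For a window of six codons with a minimum of four, the same functions would be called with `k=4` and the corresponding codon numbers.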

7. A recombinant DNA vector that comprises the DNA sequence of Claim 1.

8. The recombinant DNA sequence of Claim 7 selected from the group
consisting of plasmids pBSM:Tth, pLSG21, pLSG22, pLSG23, pLSG24, pLSG27, 
pLSG28, pLSG29, pLSG30, pLSG31, pLSG32, pLSG33, pLSG34, pLSG35, and 
pLSG36. 

9. The recombinant DNA sequence of Claim 8 that is plasmid pBSM:Tth.

10. A recombinant DNA vector selected from the group consisting of plasmids 
pBSM:Tth5', pBSM:TthΔStuI/HindIII, pLSG25, and pLSG26.

11. A recombinant host cell transformed with a vector of Claim 7.

12. The recombinant host cell of Claim 11 that is E. coli.

13. The recombinant host cell of Claim 12, transformed with a vector selected
from the group consisting of plasmids pBSM:Tth, pLSG21, pLSG22, pLSG23,
pLSG24, pLSG27, pLSG28, pLSG29, pLSG30, pLSG31, pLSG32, pLSG33,
pLSG34, pLSG35, and pLSG36.

14. The recombinant host cell of Claim 12 that is E. coli K12/pBSM:Tth. 

15. A method for purifying Thermus thermophilus DNA polymerase I from
T. thermophilus cells, said method comprising: 

(a) preparing a crude cell extract from said cells; 

(b) adjusting the ionic strength of said extract so that said polymerase 
dissociates from any nucleic acid in said extract; 

(c) subjecting the extract to hydrophobic interaction chromatography; 

(d) subjecting the extract to DNA binding protein affinity chromatography; 

(e) subjecting the extract to nucleotide binding protein affinity chromatography;
and

(f) subjecting the extract to chromatography selected from the group consisting 
of anion exchange, cation exchange, and hydroxyapatite chromatography. 